Add a little fake lag based on history result: 400ms for 50 lines
under normal conditions where 50 lines = 50 lines. But this can go
up to 5000ms for worst-case amplification attacks where requesting
50 lines actually returns 50*15=750 lines when each line is a multiline
with max-lines, which gets you close to 350k+. This would only happen
if someone on the channel is doing evil stuff (with presumably consent
of the ops).
Also guard against hiting max sendq. If we are too close, then we
reject the CHATHISTORY request rather than quiting with "Max SendQ
exceeded". This protects against an attack where someone would be
tricked into joining a channel with amplified history (as explained
in previous paragraph), their client would do an automatic CHATHISTORY
request and then the victim would exceed max sendq and thus be killed.
And yes, this and maaaaany other multiline + history interactions
and many "buts" and security/flood concerns are why this implemtnation
took (and still takes) a lot of hours to get right :D.
For +H we now temporarily allow overshooting. This only matters for low limits.
Multiline batches are atomic so we have to choose to keep them as a whole
or remove the complete batch. So if +H 5:1h and the last message was a 15-line
multiline event, what do we do? We allow temporary overshooting to store the
15 lines. As said, the alternative would be to store 0 lines which would be
worse in terms of functionality, and the small overshoot is defensible.
For higher limits (where the +H line limit is bigger than multiline max-lines),
we always stay under the +H limit. Eg if all history in a channel consists
of 15 line multiline events and we have +H 100 then we will store 90, not 105.
It's only for +H linelimit < max-lines that this matters, because there the
zero-lines consequence sucks too much ;)
This way you can limit the number of pastes going on in a channel, as
this is from everyone in that channel (like 'm') not individual (like 't').
If it is exceeded then we will simply reject the BATCH, similar to
how action d(rop) works for some other subtypes. You won't see the paste
on the channel, only the sending user receives an error (MULTILINE_PASTE_LIMIT).
Small note: a multiline BATCH of just 2 lines is not considered a paste.
We consider a multiline of 3+ lines as a paste. I think that is reasonable,
since a two-line-multiline is not that much of a paste ;).
In the default anti-flood profile (+F normal) we also set 2p per 15s,
so this means channels are by default limited to 2 pastes per 15s max.
Of course, you can override this with +f [4p]:15 or whatever you like.
In terms of +F profiles, the defaults are (maximum x pastes per 15 seconds):
very-strict: 1p
strict: 1p
normal: 2p
relaxed: 2p
very-relaxed: 3p
and 7 for unknown-users (with max-bytes 5250 and 1500 respectively). This
allows pasting a short snippet of code, config file, text from a site, etc.
With multiline you have the guarantee that:
1) You will see the entire text with no delay between lines
2) You won't see another persons chat half-way through such a paste
3) For multiline supporting clients it is now clear that all the text
belongs to each other, which can make selecting/copying it easier.
This basically means short snippets/pastes like that can be completely on
IRC again. No need for a pastebin for it. Though, you may still need such
a service if you are pasting more lines.
Regarding the implementation in UnrealIRCd:
* Clients without multiline get individual fallback lines (concat lines
merged, blank lines skipped, as per spec). And we know that clients like
weechat - which does support multiline - also shows all lines and not
only a few plus snippet style "[.."]. That is another reason for only
allowing 15 lines by default and not something much more. Otherwise all
those clients would get a big wall of text, which just sucks.
* Spamfilter (also) runs on the full text of all lines together, so
splitting a phrase across lines does not evade spamfilter.
* Fakelag: a client can send the BATCH start+PRIVMSG (or NOTICE)+BATCH end
at full speed. We impose no fake lag there. Also, the multiline default
max-lines and max-bytes are lower than the example class::recvq of 8000,
so should be perfectly safe. If the entire BATCH is accepted then we
will impose fake-lag afterwards, with a cap of 15 seconds maximum.
If the BATCH is rejected, we impose half the fakelag plus 2sec.
* If the time between BATCH start and BATCH end is more than 15 seconds
then the BATCH is rejected (set::multiline::batch-timeout).
* The BATCH is atomic (either you see it all, or you see none of it):
* When the client sends it to server, it is buffered first.
* Only after the batch close the server indicates if it is accepted
or rejected. This has various reasons, two of them are: 1) The client
is going to send everything in one go anyway and not wait for a
response between each PRIVMSG, and 2) we can't do many checks in the
buffering stage and skip those after, that would cause a TOCTOU
problem (eg. a banned user still being able to speak).
* If any line gets rejected due to spamfilter or other case
(eg +c, +b ~text with block, etc etc), the entire batch is rejected
* Locally we deliver all or nothing (as said)
* S2S we buffer the batch as well, so if a server splits after having
received 10 lines out of 15, then clients will not see anything.
* We send max-lines and max-bytes, this is the hard upper limit.
* A multiline can still be limited more tight if:
* +f with 't' or 'm' restricts to fewer lines,
eg +f [5t]:15, which means max 5 lines per 15 seconds,
means the max accepted multiline is 5 for that channel.
* +F works the same, except that default +F normal does not
have a 't' at the moment and 'm' is very high (50) so
practically not limited by default.
* There will be a future +f flood subtype for some more control
TODO: we will send CAP NEW on unknown-users <-> known-users to
indicate the new max-lines value if you transition security groups
TODO: chat history does not yet include multiline batches.
in NameList, Tag, Watch and HistoryLogLine.
This does mean the allocation routines need a +1 everywhere, but
I think I got all of them. I also don't see them being used directly
in such a way in 3rd party modules (which is logical, as they
should use the API and not allocate such structs directly).
Also, SpamExcept has been removed as it was not used anywhere.
since the message/notice would not make it through either.
This also means someone can no longer iterate through users to see who
is +D/+R by sending a "silent" TAGMSG. (Silent in the sense that the
end-user usually would not have noticed)
Suggested in https://bugs.unrealircd.org/view.php?id=6579 by zw32h (I think)
This also means HOOKTYPE_CAN_SEND_TO_USER now allows you to NOT to
set errmsg, to silently drop a message. Previously we would crash
deliberately on such a situation to enforce that all modules would
set a proper errmsg.
user is in known-users or in unknown-users. Not used anywhere yet.
Every 2 minutes we rescore all users. Or more specifically: every
5 seconds we rescore 1/24th of all users. That's the slow update path.
On certain events that cause a likely/possible transition, we update
the cache immediately. At the moment that is on IP change and account
login/logout. More will be added later.
This so we can use fast(er) techniques here and there.
New functions are:
channel_has_invisible_users(client)
set_user_invisible(client, channel, 1|0)
Existing functions:
invisible_user_in_channel(client, channel)
user_can_see_member(user, target, channel)
user_can_see_member_fast()
This is work in progress, although the tests seem to pass atm.
things by making the keys with the most lookups first, e.g. "reputation",
"geoip", "certfp". This order is based on actual lookup counts during a
quick test with 250 clones doing some typical IRC traffic.
Key: Lookups: Position before: After split: After split+order:
"reputation" 20362 37 14 1
"geoip" 10555 44 15 2
"certfp" 9264 23 8 3
"webirc" 7407 27 10 4
"websocket" 7110 55 19 5
We could also consider going for a hash table, but this may be "good enough" for now.
up moddata_client_get() etc -> findmoddata_byname().
Apparently we have 52 moddata registrations (that is without 3rd party modules)
so otherwise it is a loooong linked list.
This function was added a short while ago, and well it seems to be
able to be possible in a module. Since the 'isupport' module is mandatory
and this is ISUPPORT related, it is the right place.
Can't move isupport_snapshot() because modules might not be loaded yet
or things are currently unloading, i think. Not important anyway.
Also, make things work if there are more changes than would fit
on one isupport line. Although I didn't really test this..
Ended up splitting things in 3 helper functions to avoid some
goto and/or duplicate code and stuff. The alternative was, surprisingly,
even more ugly.
* Calling from source is now in a separate function: int can_use_nick(Client *client, const char *nick)
* For hooks: don't free the reject reason, must use static storage like all other hooks
(TODO: clarify in all hooks?)
* Move it up a bit, right before find_qline
TODO (not necessarily me :D):
* Make it an efunc
* Also call it from some other places that do find_qline, like rpc/user.c
* You may want to prod 3rd party modules like SANICK
Example output:
*** SPAMINFO ***
This will show the original text and the deconfused text which can be used in a spamfilter block with input-conversion deconfused;
Original spam text: ẔŽŽẐ𝞕ȤℤΖℨℨ𝒁𝓩ẒŹƵᏃŻẒŽℨŹ𝒵𝛧Ż𝝛𝛧ℨℤ𝜡Ƶ𝞕𝘡ŹẐ𝑍ẔẐẐΖ𝜡Ẕ𝜡Ẕ𝞕ꓜ𝚭ᏃẐẔ𝙕
Deconfused spam text: ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ
AntiMixedUTF8 points: 64
Number of Unicode characters in total: 50
Number of different Unicode blocks used: 8
Unicode Block breakdown (name: bytes [capped at 255]):
- Latin Extended-A: 8
- Latin Extended-B: 3
- Greek and Coptic: 2
- Cherokee: 2
- Latin Extended Additional: 12
- Letterlike Symbols: 6
- Lisu: 1
- Mathematical Alphanumeric Symbols: 16
switches like antimixedutf8 did, and counts the number of characters
used per unicode block. Potentially more can be added later, this is
flexible and modules can add stuff (..well not yet.. the struct is
missing some members..).
Use it from antimixedutf8 so that it now uses the new code, which is
similar to what I made and then reverted in July 2023:
https://github.com/unrealircd/unrealircd/commit/3e2f668f10fccedfd035526d7b20d7ca6819a8ae
..except that it now calculated in src/modules/utf8functions.c.
But yeah, this needs more testing and possibly (default) score
adjustments to deal with false positives !! And a warning in release notes :D
Put the text analysis in ClientContext member textanalysis,
so typically accessed through clictx->textanalysis.
Note that this struct can (and often is) NULL, for example if it is
a remote client, if it is not a PRIVMSG/NOTICE (will improve later)
or if the utf8functions module is not loaded (to keep things optional).
BREAKING CHANGE is that ClientContext is now passed in the
HOOKTYPE_CAN_SEND_TO_CHANNEL and HOOKTYPE_CAN_SEND_TO_USER hooks.
So HOOKTYPE_CAN_SEND_TO_USER prototype changed from:
int hooktype_can_send_to_user(Client *client, Client *target, const char **text, const char **errmsg, SendType sendtype);
To:
int hooktype_can_send_to_user(Client *client, Client *target, const char **text, const char **errmsg, SendType sendtype, ClientContext *clictx);
And HOOKTYPE_CAN_SEND_TO_CHANNEL prototype changes from:
int hooktype_can_send_to_channel(Client *client, Channel *channel, Membership *member, const char **text, const char **errmsg, SendType sendtype);
To:
int hooktype_can_send_to_channel(Client *client, Channel *channel, Membership *member, const char **text, const char **errmsg, SendType sendtype, ClientContext *clictx);
A side-affect of this change for antimixedutf8 purposes is that,
while the analysis is only done once per line, the 'actions' are
performed for each target, so the action will run 4 times for
"PRIVMSG a,b,c,d :text" although that may not be important in
practice. Just mentioning.
I started work on this back then but didn't finalize it. Now I
have to figure out what was left to be done :D. Other than the
obvious case of seeing some debugging code that prints out for
every converted character. Not yet visible / usable by end-users!
Also fix documentation for ~10 hooks to mention the hook name.
Obviously, the maxperip module is loaded by default (in modules.default.conf)
but it is nice to have the 400+ lines contained in a separate module
rather than being in the nick module that does NICK/UID handling.
Will look at moving more later..
It now passes 'clictx' which at the moment only has clictx->cmd which
points to the command handler. So only useful in very few cases where
you have like a generic command handler and thus have no idea for which
command you are being called. In the future, with this new ClientContext
struct, we can simply add new fields to the struct without breaking
things in the core and in (third party) modules.
If you use the magic functions in your modules CMD_FUNC(cmd_mycmd),
OVERRIDE_FUNC(myoverride), CALL_NEXT_COMMAND_OVERRIDE() and such then
you shouldn't have any compile errors as these will use the correct
prototypes and variable names automatically. In a few cases you can't
use these, in which case you will need to update your modules.
Previously if a new history item was added (because someone sent a message)
we would always append at the end of chat history buffer of the channel.
Now we put the message at the position decided by the "time" message tag,
which could be at the end but also slightly before that.
* Upside: should result in a consistent chat history on all servers
* Downside: if your server time is off for several seconds then it
could look a little weird. Then again, it would already have looked weird
in real live chat with timestamps and when replaying chat history probably.
Also add some simple optimizations: in the log line object we now have direct
pointers to the msgid and time strings, so the code doesn't need to do a
find_mtag() all the time. This should lower CPU usage during log playback
and also makes things more simple in the source code.
I did some testing with various history injection variants but this needs
more extensive testing.
and use it not only from vhost { } block code but also for like
blacklist::reason.
This so the same variables with the same names are available at
those places.
Supported are:
$nick, $username, $realname, $ip, $hostname, $server, $account,
$operlogin, $operclass, $country_code (xx for unknown),
$asn (0 for unknown).
This so if there is ever an issue, we can hot-patch it. This affects
exit_client(), exit_client_fmt(), exit_client_ex(), banned_client(),
and various (internal) help functions.
This also means you cannot call these functions during TEST/INIT (eg
during REHASH) since the 'quit' module which provides these modules
may not be loaded yet. I don't think that's a situation/problem but
this needs some more testing.
This to replace the scattered IP setting. It is very important to always
use set_client_ip() from this point. Everywhere!
Also, in addition to client->ip, this adds client->rawip that contains
the IP in network byte order. In older UnrealIRCd versions we always had
the raw IP but not the IP as a string, so we moved to IP as a string,
but it can be useful to have both in terms of optimizations.
Of course, then the client->ip and client->rawip always need to 100% match,
hence the set_client_ip().
This also changes IsIPV6() to do A BUGFIX, it changes it from:
* if local user is the user connected over IPv6? Otherwise, does it have ':' in the IP?
To:
* check if the IPv6 flag is set (which is set if IP contains ':')
This may seem insignificant but it means that for spoofed IP addresses,
such as WEBIRC or transparant proxy, we use the correct transport.
Previously, if the proxy was IPv6 then even if the spoofed user was using
IPv4, the ident check would still be tried over IPv6. That sort of fun.
From now in, in such a situation client->local->socket_type will be
SOCKET_TYPE_IPV6 but since client->ip (and rawip) will contain IPv4
the IsIPV6() will actually return false, as it should be.
Also, in the HOOKTYPE_IP_CHANGE, enforce that if HOOK_DENY is returned,
the the user is killed by dead_link(). The user must be killed because
that is what we expect, and you cannot use exit_client() because from
some code paths that would be too much freed structures / hassle,
as a comment in src/modules/connect-flood.c correctly states:
/* There are two reasons why we can't use exit_client() here:
* 1) Because the HOOKTYPE_IP_CHANGE call may be too deep.
* Eg: read_packet -> webserver_packet_in ->
* webserver_handle_request_header -> webserver_handle_request ->
* RunHook().... and then returning without touching anything
* after an exit_client() would not be feasible.
* 2) Because in HOOKTYPE_ACCEPT we always need to use dead_socket
* if we want to print a friendly message to TLS users.
*/
Note that this is still a dumb interface and not a real proper
authentication framework.
This adds HOOKTYPE_SASL_AUTHENTICATE and HOOKTYPE_SASL_MECHS and
also provides 3 functions: sasl_succeeded(), sasl_failed() and
a helper function decode_authenticate_plain() for AUTHENTICATE PLAIN.
set::central-blocklist::spamreport and ::spamreport-enabled are now GONE.
We now require a normal spamreport block, just like for other spamreport
functionality. So, if you want to enable this feature, use:
spamreport unrealircd { type central-spamreport; }
See https://www.unrealircd.org/docs/Central_spamreport for all info.
You can use CBL with central spamreport or central spamreport without CBL.
All explained at that URL.
The `watch-check` function now has a new argument which can be used to pass data to watch_notify callbacks.
New `watch_add` and `watch_del` hooks are called whenever new entries are created or removed.
New `monitor_notification` hook is called whenever a RPL_MONONLINE or RPL_MONOFFLINE is being sent, so a module can add its own notification besides it.
The LoadPersistent*()/SavePersistent*() functions caused moddata to be
tagged with ->unloaded=1. Though it seems it caused no real issues this
is not good... we now properly tag them as 0 and the like. Also did a
code cleanup / overhaul on that system as well.
For other ModData we now handle the case where a module is loaded with
with a newer version and that newer version is no longer having certain
moddata, eg the name changed or it no longer needs it.
KNOWN ISSUE:
Unfortunately we cannot call the free function for the old moddata that
is no longer being handled by the newer version of the module, since the
module is already unloaded. So this will result in a memory leak, but
not in a crash.
KNOWN ISSUE:
Similarly, for SavePersistentPointer() there is a free function, again
this is called just fine if the module is permanently unloaded but NOT
if the module is reloaded with the same name and no longer is interested
in the persistent pointer object. Again, here too, that would result
in a memory leak but not in a crash.
Fortunately the "known issues" are rare. Fixing these is impossible
with the current module API because modules are unloaded after MOD_TEST
and before MOD_INIT, and only after MOD_INIT we know which moddata
is handled by the new version of the module. To change that we would
need to keep the old module around until after MOD_INIT of the new
module (so we can call free functions in the old module), but that
means delaying the MOD_UNLOAD for the old modules until after MOD_INIT
of the new modules, which changes the sequence too much that i don't
dare to do that. For example, it would mean a database save routine
in the old module would only be called after MOD_INIT finished in the
new module, which may be unexpected since right now MOD_UNLOAD is
called before MOD_INIT and maybe the db loading is done in MOD_INIT,
which would need to be moved to MOD_LOAD. That's just one example,
there may be others. I think such a change can only be done on a major
UnrealIRCd version change, so we will have to live this for now.
As said, fortunately it is a corner case.
action { set REPUTATION--; } and similar.
Also enhancement to reputation S2S traffic, to support decreasing:
*
+ * Since UnrealIRCd 6.0.2+ there is now also asterisk-score-asterisk:
+ * :server REPUTATION 1.2.3.4 *2*
+ * The leading asterisk means no reply will be sent back, ever, and the
+ * trailing asterisk will mean it is a "FORCED SET", which means that
+ * servers should set the reputation to that value, even if it is lower.
+ * This way reputation can be reduced and the reducation can be synced
+ * across servers, which was not possible before 6.0.2.
+ *
So if you are actually decreasing reputation, you need all servers on
6.0.2 or higher for it to work properly, otherwise the other servers
don't decrease it, and next connect the highest wins again, etc.
This is a mandatory module to load, and included in modules.default.conf.
This also meant that the crule_test() etc efunctions are available
before running config test routines, so we now have a flag for
early efuncs. I guess we could consider doing that for all efuncs
though, so not sure if this flag is really needed.
that was added late in 6.1.1 development to fix a crash with removing
websocket listeners. Now replaced with a generic HOOKTYPE_CONFIG_LISTENER
that is not only called for removed listeners, but for all listeners.
a function called start_dns_and_ident_lookup(). This can then
be easily called from other places as well, like the code k4be
did in src/modules/websocket.c to handle proxies.
Side-effect is that ident lookups would now be done, if we are
configured to do so, for forwarded webirc stuff (not that I
think many people use that feature at the moment...).