connect-flood and maxperip module. This so they actually take
set::default-ipv6-clone-mask into account.
This also changes the maxperip module to a more simple method of
just freeing all entries and rebuilding the hash table on load.
That's necessary since now set::default-ipv6-clone-mask can change.
The OOB write did not happen on file-backed downloads, such as remote
includes. It only happened for memory-backed requests, which are only
these 4 in standard UnrealIRCd: centralblocklist, central spam report,
other spamreport blocks (eg to dronebl) and the log block with
destination webhook. All those 4 cases are very likely to be trusted
web servers, given the nature of the data you are sending to them.
The fix was to extend the size fields everywhere to 64 bits. It was
applied to both URL backends: url_unreal.c and url_curl.c.
The new API feature is a 'max_size' in OutgoingWebRequest, which
defaults to 1MB. This is only used for memory-backed responses,
so not for real file downloads. This fixes not only the reported
bug but also the case where a rogue webserver was unbounded in
terms of what response it could send back, potentially filling
up gigabytes of server memory.
Reported by Link420.
This wasn't done before, because optimizing stuff can always introduce
nice new issues. But is kinda necessary now since the previous way was
very inefficient. This now builds all the necessary buffers for multiline
clients and for non-multiline clients. And then iterates through both
types of clients, sending what they need. Instead of doing it the other
way around.
I had the dillema to either expose the linecache API and have everything
in multiline.c. Or, i do not expose linecache, and we do everything in
send.c. The downside of the latter is that if there is mistake then we
can't simply reload (or unload) the module to solve it. So, I have chosen
to expose the linecache API (sure, less clean) since that leaves us with
options if we screw up, plus it means everything related to multiline
sending is nicely in multiline.c, which is i guess just as good as an
argument as well ;)
This way you can limit the number of pastes going on in a channel, as
this is from everyone in that channel (like 'm') not individual (like 't').
If it is exceeded then we will simply reject the BATCH, similar to
how action d(rop) works for some other subtypes. You won't see the paste
on the channel, only the sending user receives an error (MULTILINE_PASTE_LIMIT).
Small note: a multiline BATCH of just 2 lines is not considered a paste.
We consider a multiline of 3+ lines as a paste. I think that is reasonable,
since a two-line-multiline is not that much of a paste ;).
In the default anti-flood profile (+F normal) we also set 2p per 15s,
so this means channels are by default limited to 2 pastes per 15s max.
Of course, you can override this with +f [4p]:15 or whatever you like.
In terms of +F profiles, the defaults are (maximum x pastes per 15 seconds):
very-strict: 1p
strict: 1p
normal: 2p
relaxed: 2p
very-relaxed: 3p
and 7 for unknown-users (with max-bytes 5250 and 1500 respectively). This
allows pasting a short snippet of code, config file, text from a site, etc.
With multiline you have the guarantee that:
1) You will see the entire text with no delay between lines
2) You won't see another persons chat half-way through such a paste
3) For multiline supporting clients it is now clear that all the text
belongs to each other, which can make selecting/copying it easier.
This basically means short snippets/pastes like that can be completely on
IRC again. No need for a pastebin for it. Though, you may still need such
a service if you are pasting more lines.
Regarding the implementation in UnrealIRCd:
* Clients without multiline get individual fallback lines (concat lines
merged, blank lines skipped, as per spec). And we know that clients like
weechat - which does support multiline - also shows all lines and not
only a few plus snippet style "[.."]. That is another reason for only
allowing 15 lines by default and not something much more. Otherwise all
those clients would get a big wall of text, which just sucks.
* Spamfilter (also) runs on the full text of all lines together, so
splitting a phrase across lines does not evade spamfilter.
* Fakelag: a client can send the BATCH start+PRIVMSG (or NOTICE)+BATCH end
at full speed. We impose no fake lag there. Also, the multiline default
max-lines and max-bytes are lower than the example class::recvq of 8000,
so should be perfectly safe. If the entire BATCH is accepted then we
will impose fake-lag afterwards, with a cap of 15 seconds maximum.
If the BATCH is rejected, we impose half the fakelag plus 2sec.
* If the time between BATCH start and BATCH end is more than 15 seconds
then the BATCH is rejected (set::multiline::batch-timeout).
* The BATCH is atomic (either you see it all, or you see none of it):
* When the client sends it to server, it is buffered first.
* Only after the batch close the server indicates if it is accepted
or rejected. This has various reasons, two of them are: 1) The client
is going to send everything in one go anyway and not wait for a
response between each PRIVMSG, and 2) we can't do many checks in the
buffering stage and skip those after, that would cause a TOCTOU
problem (eg. a banned user still being able to speak).
* If any line gets rejected due to spamfilter or other case
(eg +c, +b ~text with block, etc etc), the entire batch is rejected
* Locally we deliver all or nothing (as said)
* S2S we buffer the batch as well, so if a server splits after having
received 10 lines out of 15, then clients will not see anything.
* We send max-lines and max-bytes, this is the hard upper limit.
* A multiline can still be limited more tight if:
* +f with 't' or 'm' restricts to fewer lines,
eg +f [5t]:15, which means max 5 lines per 15 seconds,
means the max accepted multiline is 5 for that channel.
* +F works the same, except that default +F normal does not
have a 't' at the moment and 'm' is very high (50) so
practically not limited by default.
* There will be a future +f flood subtype for some more control
TODO: we will send CAP NEW on unknown-users <-> known-users to
indicate the new max-lines value if you transition security groups
TODO: chat history does not yet include multiline batches.
As usual, this is mostly for configuration templates that you use for
multiple servers, that sort of things, eg.
@if !environment("ADMIN")
@error "Environment variable ADMIN is not set"
@endif
This also adds a change in conf.c so @define, @error and
@warning are skipped in @if blocks that evaluate to false
(that's obviously what everyone wants :D). So that fixes a
previous bug with @define in @if.
For the extbans that we ship, no problem, as this isn't used in
any of our extbans, but for third party it may matter, or for us
in the future.
Just something we came across while looking into the issue from
previous commit.
user is in known-users or in unknown-users. Not used anywhere yet.
Every 2 minutes we rescore all users. Or more specifically: every
5 seconds we rescore 1/24th of all users. That's the slow update path.
On certain events that cause a likely/possible transition, we update
the cache immediately. At the moment that is on IP change and account
login/logout. More will be added later.
New RPC methods:
- security_group.list: List all security groups
- security_group.get: Get details of a specific security group
- connthrottle.status: Get full connection throttle status, counters, and config
- connthrottle.set: Enable/disable connection throttling
- connthrottle.reset: Reset connection throttling counts
This also adds json_expand_mask_list(), json_expand_name_list(), and
json_expand_nvplist() to src/json.c for reuse by RPC modules.
This so we can use fast(er) techniques here and there.
New functions are:
channel_has_invisible_users(client)
set_user_invisible(client, channel, 1|0)
Existing functions:
invisible_user_in_channel(client, channel)
user_can_see_member(user, target, channel)
user_can_see_member_fast()
This is work in progress, although the tests seem to pass atm.
allow {
mask *;
password "secret";
password "letmein";
}
This is always an "OR" type of match, any match means you pass.
I was actually doing this for the dual-cert stuff from previous commit,
where this can come in handy:
link irc1.example.org {
...
password "AHMYBevUxXKU/S3pdBSjXP4zi4VOetYQQVJXoNYiBR0=" { spkifp; };
password "jNw8P4QMg9tqjEJ4/lFikXBNHdIGSeN2B4/T322VjIo=" { spkifp; };
...
}
This function was added a short while ago, and well it seems to be
able to be possible in a module. Since the 'isupport' module is mandatory
and this is ISUPPORT related, it is the right place.
Can't move isupport_snapshot() because modules might not be loaded yet
or things are currently unloading, i think. Not important anyway.
Also, make things work if there are more changes than would fit
on one isupport line. Although I didn't really test this..
Ended up splitting things in 3 helper functions to avoid some
goto and/or duplicate code and stuff. The alternative was, surprisingly,
even more ugly.
to all ISUPPORT tokens, instead of only CHANMODES, PREFIX and STATUSMSG.
E.g. changing set::min-nick-length would also broadcast the change.
Technically we will call isupport_snapshot() before the rehash (or before
delayed module unload) and then after modules were reloaded/unloaded we
call isupport_check_for_changes(). This uses the ISUPPORT system in a
general way, so works the same for all tokens.
https://www.unrealircd.org/docs/Set_block#set::send-isupport-updates
TODO: Deal with more than X changes (is currently an abort, crash)
TODO: batch for draft/extended-isupport
always available (also w/cURL) so it can be used by the crash
reporter. And delete duplicate code crashreport_init_tls()
function since it is now unused.
As always, duplicate code causes problems when one is changed and
the other is not. This also happened here, where the curves or
TLS groups where set in url_unreal but not in the crash reporter.
Now that one is minor, but the danger is clear.
users by server port (eg 6667, 6697, 8000, etc).
This also adds security-group::exclude-server-port for consistency.
And in crules the function server_port() returns the server port number,
so you can use rule 'server_port()>6690' for example.
Note that for remote clients this will only work after previous
commit (b2d0ec1af3) is loaded on all
servers, otherwise all remote clients are seen as having a server_port
of zero (0). Though you probably usually only care about this on local
users anyway.
in the EFunction but not in the actual function. That's bad since it
means the "const guarantee" got lost. And one or two similar cases with
incorrect parameter types and mismatching return types. This was
found with some analyzer, we had no bugreports with regards to this.
That is, if the set::best-practices::trusted-cert check is on and passed
("certificate is valid and issued by a trusted CA") then we also
do this new set::best-practices::trusted-cert-valid-hostname check:
/* If the trusted-cert check passes, then we do another check to see if
* the certificate is valid for me::name. Since users usually connect to your
* server by your server name it is important for the certificate to be
* valid for that name. Unless you really only care about e.g. irc.example.net,
* and not about individual irc2.example.net server names, in which case you
* can turn this off, but not sure if that is good practice.
*/
trusted-cert-valid-hostname yes;
On the incoming side it was correctly identified as link sec 2,
but on the outgoing side the localhost check failed and caused link sec 1 or 0.
Bug has beent here for a while but I don't think many people
link two UnrealIRCd servers over localhost that are on production
(i do, when dev'ing, but then I don't care about linksec, obviously)
Also, this wouldn't flag services from 2 to 0 because this bug only
affected outgoing UnrealIRCd server connections.
suggest to use Let's Encrypt.
This can be turned off via set::best-practices::trusted-cert, see
https://www.unrealircd.org/docs/Set_block#set::best-practices
Oh yeah, and this only works at OpenSSL 1.1.0 and higher, i didn't bother
with people running ancient versions.
Example output:
*** SPAMINFO ***
This will show the original text and the deconfused text which can be used in a spamfilter block with input-conversion deconfused;
Original spam text: ẔŽŽẐ𝞕ȤℤΖℨℨ𝒁𝓩ẒŹƵᏃŻẒŽℨŹ𝒵𝛧Ż𝝛𝛧ℨℤ𝜡Ƶ𝞕𝘡ŹẐ𝑍ẔẐẐΖ𝜡Ẕ𝜡Ẕ𝞕ꓜ𝚭ᏃẐẔ𝙕
Deconfused spam text: ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ
AntiMixedUTF8 points: 64
Number of Unicode characters in total: 50
Number of different Unicode blocks used: 8
Unicode Block breakdown (name: bytes [capped at 255]):
- Latin Extended-A: 8
- Latin Extended-B: 3
- Greek and Coptic: 2
- Cherokee: 2
- Latin Extended Additional: 12
- Letterlike Symbols: 6
- Lisu: 1
- Mathematical Alphanumeric Symbols: 16
Make match_spamfilter use the clictx->textanalysis->deconfused rather than
calculating its own. The latter will probably disappear altogether.
Unrelated but also fixed: properly set e->unicode_blocks.
switches like antimixedutf8 did, and counts the number of characters
used per unicode block. Potentially more can be added later, this is
flexible and modules can add stuff (..well not yet.. the struct is
missing some members..).
Use it from antimixedutf8 so that it now uses the new code, which is
similar to what I made and then reverted in July 2023:
https://github.com/unrealircd/unrealircd/commit/3e2f668f10fccedfd035526d7b20d7ca6819a8ae
..except that it now calculated in src/modules/utf8functions.c.
But yeah, this needs more testing and possibly (default) score
adjustments to deal with false positives !! And a warning in release notes :D
Put the text analysis in ClientContext member textanalysis,
so typically accessed through clictx->textanalysis.
Note that this struct can (and often is) NULL, for example if it is
a remote client, if it is not a PRIVMSG/NOTICE (will improve later)
or if the utf8functions module is not loaded (to keep things optional).
BREAKING CHANGE is that ClientContext is now passed in the
HOOKTYPE_CAN_SEND_TO_CHANNEL and HOOKTYPE_CAN_SEND_TO_USER hooks.
So HOOKTYPE_CAN_SEND_TO_USER prototype changed from:
int hooktype_can_send_to_user(Client *client, Client *target, const char **text, const char **errmsg, SendType sendtype);
To:
int hooktype_can_send_to_user(Client *client, Client *target, const char **text, const char **errmsg, SendType sendtype, ClientContext *clictx);
And HOOKTYPE_CAN_SEND_TO_CHANNEL prototype changes from:
int hooktype_can_send_to_channel(Client *client, Channel *channel, Membership *member, const char **text, const char **errmsg, SendType sendtype);
To:
int hooktype_can_send_to_channel(Client *client, Channel *channel, Membership *member, const char **text, const char **errmsg, SendType sendtype, ClientContext *clictx);
A side-affect of this change for antimixedutf8 purposes is that,
while the analysis is only done once per line, the 'actions' are
performed for each target, so the action will run 4 times for
"PRIVMSG a,b,c,d :text" although that may not be important in
practice. Just mentioning.
I started work on this back then but didn't finalize it. Now I
have to figure out what was left to be done :D. Other than the
obvious case of seeing some debugging code that prints out for
every converted character. Not yet visible / usable by end-users!
It now passes 'clictx' which at the moment only has clictx->cmd which
points to the command handler. So only useful in very few cases where
you have like a generic command handler and thus have no idea for which
command you are being called. In the future, with this new ClientContext
struct, we can simply add new fields to the struct without breaking
things in the core and in (third party) modules.
If you use the magic functions in your modules CMD_FUNC(cmd_mycmd),
OVERRIDE_FUNC(myoverride), CALL_NEXT_COMMAND_OVERRIDE() and such then
you shouldn't have any compile errors as these will use the correct
prototypes and variable names automatically. In a few cases you can't
use these, in which case you will need to update your modules.
mostly with regards to memory leaks if duplicate config directives are used.
Eg using allow::password twice in the same allow block, or using
link::outgoing::tls-options twice in the same link block. Unusual stuff.
Eg: vhost "$operlogin@$operclass.example.net";
Also add potentially_valid_vhost() function which can be used in
config code to ignore invalid $vars. Then at runtime you use the
real valid_vhost() function after variable expansion by
unreal_expand_string().
and use it not only from vhost { } block code but also for like
blacklist::reason.
This so the same variables with the same names are available at
those places.
Supported are:
$nick, $username, $realname, $ip, $hostname, $server, $account,
$operlogin, $operclass, $country_code (xx for unknown),
$asn (0 for unknown).
* Convert to use module-based config handling
* Split part of VHOST command into do_vhost() for later
* Use AppendListItem instead of AddListItem so they are in config-order.
This is not really important atm but will matter later if we go auto.
* No other code changes at this point
This so if there is ever an issue, we can hot-patch it. This affects
exit_client(), exit_client_fmt(), exit_client_ex(), banned_client(),
and various (internal) help functions.
This also means you cannot call these functions during TEST/INIT (eg
during REHASH) since the 'quit' module which provides these modules
may not be loaded yet. I don't think that's a situation/problem but
this needs some more testing.