unrealircd

mirror of https://github.com/unrealircd/unrealircd.git synced 2026-07-02 09:06:39 +02:00

Author	SHA1	Message	Date
Bram Matthys	6bbcdfd1b3	Add spamfilter::rule (preconditions), add context to crule parser, and add the first functions: online_time() and reputation(). The more interesting stuff will follow later...	2023-07-06 16:14:26 +02:00
Bram Matthys	4b4562516c	Another attempt at UTF8-aware spamfilter. This was previously tried on 19-apr-2020 in `bc70882bd3` in UnrealIRCd 5.0.5. Sadly it had to be reverted immediately with a quick 5.0.5.1 release, all because of a PCRE2 100% CPU usage bug. Since then that bug has been fixed, plus another bug. I'm now re-adding it "as an option" that is marked experimental. Hopefully people test it out and can report back if it works well and then we can make it the default someday. This makes it a runtime setting, so it is much easier to switch back/forth if there are any issues without recompiling anything. Had to use a bit more code now though to handle the recompiling of spamfilters if the setting is changed. Original issue was https://bugs.unrealircd.org/view.php?id=5187 * [Spamfilter](https://www.unrealircd.org/docs/Spamfilter) can be made UTF8-aware. * This is experimental, to enable: `set { spamfilter { utf8 yes; } }` * Case insensitive matches will then work better. For example, with extended Latin, a spamfilter on `ę` then also matches `Ę`. * Other PCRE2 features such as [\p](https://www.pcre.org/current/doc/html/pcre2syntax.html#SEC5) can then be used. For example you can then set a spamfilter with the regex `\p{Arabic}` to block all Arabic script. Please do use these new tools with care. Blocking an entire language or script is quite a drastic measure. * As a consequence of this we require PCRE2 10.36 or newer. If your system PCRE2 is older than this, the UnrealIRCd-shipped library version will be compiled and `./Config` may take a little longer than usual.	2023-03-22 09:00:31 +01:00
Bram Matthys	329fd07f3a	Revert set::spamfilter::utf8-support from yesterday. This will be for a later release, needs more thought and work.	2022-01-06 18:03:26 +01:00
Bram Matthys	dedff543b5	Add option set::spamfilter::utf8-support which defaults to 'no' for now. When you set this to 'yes' you get more options... See next (modified) copy-paste from April 2020, which had to be reverted because PCRE2 was broken. Now it's an opt-in and hopefully matured a bit. This means: * Case insensitive matches work better in UTF8 now, such as extended Latin. For example, a spamfilter on "ę" now also matches "Ę", while previously it did not catch this. * Other PCRE2 features such as https://www.pcre.org/current/doc/html/pcre2syntax.html#SEC5 are now available. For example you can now set a spamfilter with the regex \p{Arabic} to block all Arabic script, or \p{Cyrillic} to block all Cyrillic script (such as Russian) Use these new tools with care, of course. Blocking an entire language, or script, is quite a drastic measure. All of this was possible because of the new PCRE2_MATCH_INVALID_UTF compile time option which was introduced in PCRE2 10.34. Now, that version turned out to be buggy. As recent as PCRE 10.36 some major bugs were fixed. This also means we now require at least PCRE2 10.36 version so everyone can benefit from this new spamfilter UTF8 feature, IF they enable set::spamfilter::utf8-support, that is. Many systems come with older PCRE2 versions so this means we will fall back to the shipped PCRE2 version in UnrealIRCd. This means ./Config will take a little longer to compile things. For packagers (rpm/deb/ports): if you choose to patch configure to not require such a recent PCRE2, then please do not allow enabling of set::spamfilter::utf8-support since it will likely cause crashes and misbehavior. Check PCRE2 changelog, CTRL+F at PCRE2_MATCH_INVALID_UTF	2022-01-05 18:08:52 +01:00
Bram Matthys	fcf020b99e	It's raining consts...	2021-09-11 09:56:22 +02:00
Bram Matthys	f085173d46	More const char * stuff... mostly in conf.c but also elsewhere.	2021-09-10 15:01:23 +02:00
Bram Matthys	66a51fb659	Massive conversions from 'char *' to 'const char *' and 'char **' to 'const char **'	2021-09-10 12:46:31 +02:00
Bram Matthys	8d2f20ef41	Newlog: debug.c, match.c, module.c, random.c and then for api-*.c log out of space in all circumstances.	2021-08-11 17:45:01 +02:00
Bram Matthys	05aeba9ba9	Get rid of Debug(()) function calls. I never use it anyway.	2021-07-12 18:54:38 +02:00
Bram Matthys	d2efe01d9b	Revert "UTF8 support in spamfilter. We now ship with PCRE2 10.34 and require this" This reverts commit `bc70882bd3`.	2020-05-29 08:25:47 +02:00
Bram Matthys	bc70882bd3	UTF8 support in spamfilter. We now ship with PCRE2 10.34 and require this version or newer on the system, otherwise we fall back to shipped version. This fixes https://bugs.unrealircd.org/view.php?id=5187 among others. It means: * Case insensitive matches work better in UTF8 now, such as extended Latin. For example, a spamfilter on "ę" now also matches "Ę", while previously it did not catch this. * Other PCRE2 features such as https://www.pcre.org/current/doc/html/pcre2syntax.html#SEC5 are now available. For example you can now set a spamfilter with the regex \p{Arabic} to block all Arabic script, or \p{Cyrillic} to block all Cyrillic script (such as Russian) Use these new tools with care, of course. Blocking an entire language, or script, is quite a drastic measure. All of this was possible because of the new PCRE2_MATCH_INVALID_UTF compile time option which was introduced in PCRE2 10.34. This also means we now require at least that PCRE2 version so everyone can benefit from this new spamfilter UTF8 feature. Many systems come with older PCRE2 versions so this means we will fall back to the shipped PCRE2 version in UnrealIRCd. This means ./Config will take a little longer to compile things. There is no indication of this as of now, but if this feature would break things heavily then it might get reverted or made configurable. This is also why it was added just after 5.0.4 release and not right before it, it needs some heavy testing.	2020-04-19 17:45:38 +02:00
GottemHams	fac16fe1c0	match_* functions actually return 1 on match and not 0 :D	2019-12-22 14:48:04 +01:00
Bram Matthys	24c60fd85e	Fix some doxygen tags (eg @notes to @note)	2019-10-26 09:33:09 +02:00
Bram Matthys	33c176e59e	Just in case pcre2_get_error_message() fails...	2019-10-11 11:17:29 +02:00
Bram Matthys	f2e3712d62	Remove various if's and such that are now unneeded This is part 5 of the memory function / caller changes.	2019-09-14 17:23:07 +02:00
Bram Matthys	9fc1e758ab	Mass change of dst = strdup(str) to safe_strdup(dst,str) but with a manual audit since 'dst' must now be initialized memory. There's still a raw_strdup() if you insist. This is step 2 of X of memory allocation changes	2019-09-14 16:58:01 +02:00
Bram Matthys	de87b439b7	Update memory allocation routines. Step 1 of X.	2019-09-14 16:52:53 +02:00
Bram Matthys	7c6358024c	Add 'natural order' string comparison to core: strnatcmp and strnatcasecmp. extern int strnatcmp(char const *a, char const *b); extern int strnatcasecmp(char const *a, char const *b); This will be handy for version comparisons. For example they will return -1 (=lower) for things like ("1.4.9", "1.4.10"), unlike strcmp. Also, some loosely related spelling fixes elsewhere.	2019-09-14 08:12:47 +02:00
Bram Matthys	70410b3f33	Remove unused variables (67 files done, will do rest another time).	2019-09-12 17:57:01 +02:00
Bram Matthys	23116d344a	Give structs the same name as the typedefs. Rename aClient to Client, aChannel to Channel, and some more. Third party module coders will love this. But.. it makes things more logical and the doxygen output will look more clean and logical as well. (More changes will follow)	2019-09-11 09:48:00 +02:00
Bram Matthys	5e4c481d93	Yes, strcasecmp is always available, configure.	2019-09-09 16:30:02 +02:00
Bram Matthys	ca2239827e	Get rid of NICK_GB2312/NICK_GBK/NICK_GBK_JAP in config.h. I am not aware of anyone actually using these. So running with this was rather untested (if it worked at all, which I doubt).	2019-09-09 16:20:26 +02:00
Bram Matthys	0d2d4d5bca	Rename match() and _match() to match_simple() -AND- invert the return value of match_simple() and match_esc(). So, developers, be aware, this is how you should use the function in a correct way: if (match_simple("fun", str)) printf("It was fun\n"); Rationale: I've always been annoyed by the inversed logic, even though it was similar to strcmp. So I've reverted it. I could have chosen to maintain match() rather than this match_simple() name, but this way I force (3rd party module) devs to update their function, while otherwise everything would mysteriously fail due to the inverted logic.	2019-08-17 09:20:49 +02:00
Bram Matthys	e1fcc3a667	Rename match() and _match() both to match_simple() and get rid of the "bahamut optimized version". Stage 1 of 2.	2019-08-17 09:15:34 +02:00
Bram Matthys	c01c9248f5	Revert `e428c77c47` (only to try again later)	2019-08-17 09:05:09 +02:00
Bram Matthys	e428c77c47	match() -> match_nuh() and _match -> match_simple()	2019-08-17 08:56:18 +02:00
Bram Matthys	1108b58951	Remove old TRE regex engine. Hasn't been maintained since 2010 and has various outstanding crash and 100% CPU issues. We have been encouraging the PCRE2 engine since the start of UnrealIRCd 4 already. TRE is being phased out of U4 by the end of the year, so we can safely remove it in U5 already.	2019-05-25 10:42:46 +02:00
Bram Matthys	5c30d1af6d	* Badword blocks now use PCRE2 if using regex at all (rare, usually the fast badwords system is used instead) * Code deduplication in src/modules/{chanmodes,usermodes}/censor.c to src/match.c -- which may be moved later again to efuncs. * Add --without-tre: This means USE_TRE will be enabled by default right now but if using --without-tre it will be undef'ed. This so we can prepare for the TRE phase-out in 2020. * Remove include/badwords.h, put contents in include/struct.h	2019-04-05 18:19:23 +02:00
Bram Matthys	704487e124	Fix numerous crash bugs in server to server code. In 3.2.x we didn't fix these bugs since servers are trusted and should send correct commands. In 4.0.x we changed this so we would fix them when we come across such issues at normal priority (not consider them security issues). I now took it a step further and actively checked/looked for these issues and a bunch of them were found. Almost all are NULL pointer dereferences, with some exceptions. * S2S: MODE: check conv_param return value (NULL ptr crash) * S2S: MODE: floodprot: More checks (NULL ptr crash) * S2S: MODE: OOB write of NULL (write NULL past last element in an array) * S2S: NICK: old compat fixes (NULL ptr crash) * S2S: PROTOCTL: Check for double SID= * S2S: SERVER: require at least 3 parameters (NULL ptr crash) * S2S: SJOIN: require at least 3 parameters (NULL ptr crash) * S2S: SJOIN: Fix OOB read (read 1 byte past buffer) * S2S: TKL: validate set_at and expire_at (NULL ptr crash) * S2S: TKL: require at least 9 parameters for spamf, not 8 (NULL ptr crash) * S2S: TKL: ignore invalid spamfilter matching type (remove abort() call) * S2S: TOPIC: querying for topic is not permitted (NULL ptr crash) * S2S: UID: require 12 parameters (NULL ptr crash) * S2S: WATCH: this is not a server command (NULL ptr crash) * Fix OOB read (1 byte beyond string) for timevals. This was reachable from config code, TKL (S2S) and /LINE (Oper). In practice no crash. MODE: make code less confusing (effectively no change) * TRACE: remove strange output in case of 0 lines of output * Fix unimportant memory leak on boot (#4713, reported by dg) * Fix small memory leak upon 'DNS i' (oper only command) * Always work on a copy in clean_ban_mask(). This fixes a bug that could result in a strlcpy(buf, buf, sizeof(buf)). So, overlapping strings, which is undefined behavior.	2017-10-29 11:20:52 +01:00
Bram Matthys	ec9db8fd5f	Move match_user() to module (efunc in m_tkl)	2017-03-18 15:00:34 +01:00
Bram Matthys	03b74f6163	Include string.h / silence warnings.	2016-12-30 15:30:59 +01:00
Bram Matthys	73ec3e3305	Fix IPv6 ban bug + fix a crash bug	2016-07-28 14:15:09 +02:00
Bram Matthys	67c998dc9f	Adding a GLINE or KLINE on usermask@ did not have any effect. Reported by soretna (#4680). Tizen, DBoyz and Valdebrick helped trace the issue. Removed MATCH_USE_IDENT since it had no useful purpose: in all cases one has to check identd first and then non-identd anyway.	2016-05-22 15:44:28 +02:00
Bram Matthys	58b864edd5	Re-do CIDR and at the same time all the user matching stuff. Introducing match_user(mask, acptr, options): this should be used everywhere rather than the many DIY routines everywhere that create a nick!user@host and then run a match() on it. The match_user() function has not been fully tested yet, at this point I'm happy we can compile again.	2015-07-28 13:26:03 +02:00
Bram Matthys	3cfee0f384	fix a number of /REHASH memleaks	2015-07-10 10:40:07 +02:00
Bram Matthys	e1b7c34c96	Fix various warnings, including one reported by Adam: possible crash in aliases (introduced 1-2wks ago)	2015-06-07 22:07:00 +02:00
Bram Matthys	0eb9c9a36b	PCRE2: enable JIT, free when no longer needed, fix & improve error message when an invalid regex is specified	2015-06-01 10:09:25 +02:00
Bram Matthys	ecd06aa530	Now actually use PCRE2.	2015-06-01 09:51:33 +02:00
Travis McArthur	3b98eac4a9	Remove unnecessary gotos	2015-05-31 21:46:32 -04:00
Bram Matthys	58bd3cf60b	Preparations for #4356 (experimental / on-going): * add general matching framework (aMatch type, unreal_match_xxx functions) * change spamfilter { } block syntax * add support for simple wildcard matching (non-regex, just '?' and '*') This is the initial commit so the new lib is not in yet, 'regex' is not functional (but 'posix' and 'simple' are working), linking has not been fully tested and no warnings are printed yet. IOTW: work in progress!	2015-05-30 21:11:11 +02:00
Bram Matthys	fa9cf506e7	- The '?' wildcard was completely broken in 3.2.4, reported by tabrisnet (#0002797).	2006-02-05 17:20:36 +00:00
Bram Matthys	dc19350c70	- Redid glob matching. Escaping is now ripped out for normal bans (as it should be), this means no more weird issues with +b \ etc not banning nicks with \ in it. ExtBan ~c/~r get special treatment and will use our match_esc [match with escaping] routine, that way you can ban channels such as "#f*ck" via "+b ~c:#f\*ck". Fix triggered by bugreport of vonitsanet (#0002782).	2006-01-30 20:14:39 +00:00
Bram Matthys	6e70facb1e	- Fixed(?) bug due to match() rewrite: we now use our old rules with escaping again, due to the switchover we were accidentally using different ones which caused funny kill messages like "You were killed by a.b.c (a!a.b.c (SOMENICK[N\A](?) <- d.e.f))." This also broke some bans in pre2/rc1. Bug reported by HERZ (#0002772).	2006-01-26 14:02:21 +00:00
Bram Matthys	5f272b56e7	- Switched over to an older match() routine based on hybrid, this one is a bit less optimized but is actually understandable and has less bugs. This fixes +b ~c:#c\t not properly matching #ct, reported by Jason (#0002752). Initial results look good, but this needs some good testing ;).	2006-01-23 22:05:50 +00:00
Bram Matthys	8a9bae11fa	- Made '?*' work correctly in wildcard matches, reported by Bugz (#2585).	2005-07-05 20:26:18 +00:00
Bram Matthys	8650c97cd3	- No longer cutoff nick upon illegal character -- just reject the whole nick. The nick is still cutoff if the nick is too long. Basically this is the same way as Hybrid does it so it should work ok :). - Added nick character system. This allows you to choose which (additional) characters to allow in nicks via set::allowed-nickchars. See unreal32docs.html -> section 3.16 for a list of available languages and more info on how to use it. Current list: dutch, french, german, italian, spanish, euro-west, chinese-trad, chinese-simp, chinese-ja, chinese. If you wonder why your language is not yet included or why a certain mistake is present, then please understand that we are most likely not experienced (at all) in your language. If you are a native of your language (or know the language well), and your language is not included yet or you have some corrections, then contact syzop@vulnscan.org or report it as a bug on http://bugs.unrealircd.org/	2005-02-19 20:47:41 +00:00
codemastr	2b3fda5a10	Documented the default behavior of snomasks when /mode nick +s is used and added 'const' to the functions in match.c	2004-11-05 21:26:38 +00:00
Bram Matthys	426fbd9663	- Added "extended bans". An idea from SorceryNet ircd. These bans look like ~<type>:<stuff>. Currently the following bans are available: ~q: quiet bans (ex: ~q:*!*@blah.blah.com). People matching these bans can join but are unable to speak, unless they have +v or higher. ~c: channel bans (ex: ~c:#idiots). People in #idiots are unable to join the channel. ~r: gecos (realname) bans (ex: ~r:Stupid_bot_script). If the realname of a user matches this then (s)he is unable to join. NOTE: an underscore ('_') matches both a space (' ') and an underscore ('_'), so this ban would match 'Stupid bot script v1.4'. These bantypes can also be used in the channel exception list (+e). +e ~r:w00t makes anyone with 'w00t' in their realname able to join, and +e ~c:#admin makes anyone in #admin able to join, etc.. This system allows modules to add extended bantypes too. This feature requires some additional testing, also the module interface will probably be changed in the next few weeks, and perhaps more extended bans will be added before next release.. we'll see...	2003-12-19 23:39:30 +00:00
Bram Matthys	45e2b69a07	- Fixed a match() bug In case of a mask like '*\' it was trying to read out of bounds data.	2003-03-09 03:07:59 +00:00
codemastr	3095782cfd	Various fixes	2003-01-14 21:25:04 +00:00
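
Commit 7c6358024c above describes 'natural order' comparison, where ("1.4.9", "1.4.10") compares as lower, unlike strcmp. A minimal sketch of that idea follows. The function name `natcmp_sketch` is hypothetical and this is not the strnatcmp code that was added to UnrealIRCd; it only illustrates comparing digit runs as numbers, under the assumption that inputs are plain ASCII.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical illustration only, not the UnrealIRCd strnatcmp().
 * Returns -1, 0 or 1. Runs of digits are compared by numeric value,
 * everything else is compared byte by byte.
 */
static int natcmp_sketch(const char *a, const char *b)
{
    while (*a && *b)
    {
        if (isdigit((unsigned char)*a) && isdigit((unsigned char)*b))
        {
            size_t la = 0, lb = 0, i;

            /* Skip leading zeros, but keep the last digit of each run. */
            while (*a == '0' && isdigit((unsigned char)a[1]))
                a++;
            while (*b == '0' && isdigit((unsigned char)b[1]))
                b++;

            /* Measure both digit runs. */
            while (isdigit((unsigned char)a[la]))
                la++;
            while (isdigit((unsigned char)b[lb]))
                lb++;

            /* Without leading zeros, a longer run is a larger number. */
            if (la != lb)
                return (la < lb) ? -1 : 1;

            /* Equal length: the first differing digit decides. */
            for (i = 0; i < la; i++)
                if (a[i] != b[i])
                    return (a[i] < b[i]) ? -1 : 1;

            a += la;
            b += lb;
        }
        else
        {
            if (*a != *b)
                return ((unsigned char)*a < (unsigned char)*b) ? -1 : 1;
            a++;
            b++;
        }
    }
    /* One string is a prefix of the other, or both ended. */
    if (*a)
        return 1;
    if (*b)
        return -1;
    return 0;
}
```

With this sketch, `natcmp_sketch("1.4.9", "1.4.10")` reports the first version as lower because the final digit runs "9" and "10" are compared as numbers, whereas strcmp stops at '9' versus '1' and reports the opposite order.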

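Commits 0d2d4d5bca and fac16fe1c0 above establish that the match_* functions return 1 on a match, so callers write `if (match_simple(mask, str))`. The sketch below shows a simple wildcard matcher ('?' and '*' only) that follows the same return convention. The name `wild_match_sketch` is hypothetical, and the case-insensitive comparison is an assumption made for this example; it is not the code in src/match.c and it does no escaping.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical illustration only, not UnrealIRCd's match_simple().
 * Returns 1 if 'str' matches 'mask', 0 otherwise.
 * '?' matches exactly one character, '*' matches zero or more.
 */
static int wild_match_sketch(const char *mask, const char *str)
{
    const char *star = NULL;  /* most recent '*' seen in mask */
    const char *retry = NULL; /* position in str to resume from on backtrack */

    while (*str)
    {
        if (*mask == '*')
        {
            /* Remember the star and first try matching zero characters. */
            star = mask++;
            retry = str;
        }
        else if (*mask == '?' ||
                 tolower((unsigned char)*mask) == tolower((unsigned char)*str))
        {
            mask++;
            str++;
        }
        else if (star)
        {
            /* Mismatch: let the last '*' absorb one more character. */
            mask = star + 1;
            str = ++retry;
        }
        else
        {
            return 0;
        }
    }

    /* Trailing stars can match the empty remainder. */
    while (*mask == '*')
        mask++;
    return *mask == '\0';
}
```

A caller following the convention from the commit message would write `if (wild_match_sketch("f?n", str)) printf("It was fun\n");`, with no inverted, strcmp-style test.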

59 Commits