my little forum
Reply to the message by
Micha
> Hello
>
> Some information can be found on the [link=https://nasauber.de/opensource/b8/readme.php]developer website[/link].
>
> > I'm curious to see the number of false positives and negatives, and how much time and how many words it needs to become stable.
>
> Yes, me too. It depends on the frequency of ham and spam in a forum. If you never train spam, all entries will be classified as ham. Training both classes, spam and ham, is the most important task. For that reason, do NOT train the filter on spam posted by a sock puppet. Restrict the training to spam written by bots.
>
> > I think the different languages in particular are a challenge for the script and the forum operators.
>
> If the forum is operated in, e.g., German and the spam entries are only in English, it will be quite easy to detect the spam (in my opinion). So I don't think one can give a more general answer on this topic.
>
> > > often overlapping with the languages of the valid entries?
>
> THAT is the challenge which is (hopefully) solved by Bayesian statistics ;-)
>
> > > How can we provide a set of training data for the forum operators (in light of the different languages), so that they do not have to start from zero?
>
> This point is discussed in the B8 documentation. In the end, it makes no sense to provide such a database because of the different languages. A Russian forum does not benefit from a German or English database.
>
> /Micha
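
To illustrate the point about training both classes, here is a minimal sketch of a word-based Bayes classifier. It is NOT B8's actual algorithm or API; the class `TinyBayes`, its methods and the sample texts are made up for this example, and it uses plain naive Bayes with add-one smoothing. It only shows why a filter that has never seen spam rates everything as ham.

```python
import math
from collections import Counter


class TinyBayes:
    """Toy naive Bayes text classifier (hypothetical, not B8's implementation)."""

    def __init__(self):
        self.words = {"ham": Counter(), "spam": Counter()}  # word counts per class
        self.texts = {"ham": 0, "spam": 0}                  # trained texts per class

    def learn(self, text, label):
        """Train one text as 'ham' or 'spam'."""
        self.words[label].update(text.lower().split())
        self.texts[label] += 1

    def spam_probability(self, text):
        """Return the estimated P(spam | text)."""
        # A class without any trained text has an estimated prior of 0,
        # so the other class wins regardless of the words in the text.
        if self.texts["spam"] == 0:
            return 0.0
        if self.texts["ham"] == 0:
            return 1.0
        vocab = len(set(self.words["ham"]) | set(self.words["spam"]))
        total = self.texts["ham"] + self.texts["spam"]
        score = {}
        for label in ("ham", "spam"):
            n = sum(self.words[label].values())
            s = math.log(self.texts[label] / total)  # log prior
            for w in text.lower().split():
                # add-one smoothing keeps unseen words from zeroing the product
                s += math.log((self.words[label][w] + 1) / (n + vocab))
            score[label] = s
        # turn the two log scores into a probability in a numerically safe way
        m = max(score.values())
        ham = math.exp(score["ham"] - m)
        spam = math.exp(score["spam"] - m)
        return spam / (ham + spam)


f = TinyBayes()
f.learn("thanks for the update on the forum script", "ham")
f.learn("the new theme works fine in my forum", "ham")
before = f.spam_probability("cheap pills buy now")  # 0.0, no spam trained yet

f.learn("buy cheap pills now", "spam")
f.learn("cheap watches buy now click here", "spam")
after = f.spam_probability("cheap pills buy now")   # now well above 0.5
```

With only ham trained, the obvious spam text still gets a spam probability of 0. After two spam texts have been trained, the same text is rated as spam, while a text made of typical forum words stays on the ham side.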