[HN Gopher] BotOrNot (2021)
___________________________________________________________________
 
BotOrNot (2021)
 
Author : peter_d_sherman
Score  : 43 points
Date   : 2023-09-15 09:23 UTC (13 hours ago)
 
web link (incolumitas.com)
w3m dump (incolumitas.com)
 
| didntcheck wrote:
| Well I get a 403 when trying to load the detection page (in
| Firefox) from behind a VPN. Guess that's a fail then?
 
| mateuszbuda wrote:
| Some parts are outdated now. Bot detection vs. web scraping is an
| ever-evolving arms race.
| 
| Anyway, the incolumitas blog is a great source of information
| about bot detection and techniques to avoid being detected. We
| have implemented many successful methods based on it at
| https://scrapingfish.com/.
 
| thevania wrote:
| link to bot detection page - https://bot.incolumitas.com/ - is
| down
 
  | Forbo wrote:
  | I just get a 403. I assumed it was because of my VPN but even
  | without it I get the same result.
 
  | 1vuio0pswjnm7 wrote:
  | Not down for me.
  | 
  |     echo 167.99.241.135 bot.incolumitas.com|sed -i -e1r/dev/stdin -e1N /etc/hosts
  |     FTPUSERAGENT= tnftp -4vdo/dev/stdout https://bot.incolumitas.com
  |     links -no-connect https://bot.incolumitas.com
  |     sed -i 1d /etc/hosts
  |     curl --resolve *:443:167.99.241.135 https://bot.incolumitas.com
 
  | [deleted]
 
  | 1vuio0pswjnm7 wrote:
  | False. It's up.
  | 
  | Alas, does not accept TLS1.3.
  | 
  |     printf 'HEAD / HTTP/1.0\r\nhost: bot.incolumitas.com\r\nconnection: close\r\n\r\n'|openssl s_client -connect 167.99.241.135:443 -ign_eof
  |     printf 'GET / HTTP/1.0\r\nhost: bot.incolumitas.com\r\nconnection: close\r\n\r\n'|openssl s_client -connect 167.99.241.135:443 -ign_eof
  |     echo|openssl s_client -connect 167.99.241.135:443 -showcerts|openssl x509 -text
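
The commands above pin the hostname to a fixed IP address and send a bare HTTP/1.0 request over TLS. A rough Python equivalent might look like the sketch below. The function names `build_request` and `probe` are hypothetical, the IP address is the one quoted in the thread and may no longer be valid, and unlike `openssl s_client` this sketch verifies the server certificate by default.

```python
import socket
import ssl


def build_request(host: str, method: str = "HEAD", path: str = "/") -> bytes:
    """Build the same minimal HTTP/1.0 request that the printf commands send."""
    return (
        f"{method} {path} HTTP/1.0\r\n"
        f"host: {host}\r\n"
        "connection: close\r\n\r\n"
    ).encode("ascii")


def probe(ip: str, host: str, max_version=None, timeout: float = 10.0):
    """Connect to `ip`, present `host` via SNI, and return
    (negotiated TLS version, HTTP status line).

    `max_version` may be set to e.g. ssl.TLSVersion.TLSv1_2 to cap the
    handshake; left as None, the client offers everything it supports.
    """
    ctx = ssl.create_default_context()  # verifies the certificate chain
    if max_version is not None:
        ctx.maximum_version = max_version
    with socket.create_connection((ip, 443), timeout=timeout) as raw:
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            tls.sendall(build_request(host))
            status_line = tls.makefile("rb").readline().decode("latin-1").strip()
            return tls.version(), status_line
```

Calling `probe("167.99.241.135", "bot.incolumitas.com")` would report which TLS version was actually negotiated, which is one way to check the "does not accept TLS1.3" observation, assuming the local OpenSSL build offers TLS 1.3.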
 
  | risingson wrote:
  | It's been down for months, sadly. I was working on some dummy
  | Puppeteer human-behavior script back then and the site helped a
  | lot. Hope the guy's doing alright :/
 
| luth0r wrote:
| Kind reminder: Google broke almost all TLS fingerprints (e.g. JA3)
| when it introduced GREASE values and other changes, such as random
| order in the ciphers, in Chromium. I work at a bot detection
| company and we have put a lot of effort into this; we currently
| have an advanced implementation that ignores some extensions,
| orders the fields, removes the GREASE values, etc.
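
The normalization described above (ignore some extensions, order the fields, remove the GREASE values) can be sketched roughly as follows. This is not the commenter's implementation: the function names are hypothetical, the input is assumed to be a standard JA3 string (`version,ciphers,extensions,curves,point_formats`, decimal values joined by `-`), and sorting only the extension list is an assumption made here for illustration.

```python
import hashlib

# The 16 GREASE code points from RFC 8701: 0x0A0A, 0x1A1A, ..., 0xFAFA.
GREASE = {(b << 8) | b for b in range(0x0A, 0x100, 0x10)}


def _parse(field: str) -> list:
    """Split a '-'-joined JA3 field into integers (empty field -> [])."""
    return [int(v) for v in field.split("-") if v]


def _fmt(values) -> str:
    return "-".join(str(v) for v in values)


def normalize_ja3(ja3: str, ignored_extensions=frozenset()) -> str:
    """Return a JA3 string with GREASE removed and extensions sorted.

    Assumption: cipher and curve order is kept as sent, since it may
    reflect client preference; only the extension list is sorted.
    """
    version, ciphers, extensions, curves, point_formats = ja3.split(",")
    ciphers_n = [c for c in _parse(ciphers) if c not in GREASE]
    extensions_n = sorted(
        e for e in _parse(extensions)
        if e not in GREASE and e not in ignored_extensions
    )
    curves_n = [c for c in _parse(curves) if c not in GREASE]
    return ",".join(
        [version, _fmt(ciphers_n), _fmt(extensions_n), _fmt(curves_n), point_formats]
    )


def ja3_hash(ja3: str) -> str:
    """MD5 hex digest of a JA3 string, as in the usual JA3 scheme."""
    return hashlib.md5(ja3.encode("ascii")).hexdigest()
```

With this sketch, two ClientHellos that differ only in their GREASE values and extension order normalize to the same string, so `ja3_hash(normalize_ja3(...))` stays stable across connections from the same browser build.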
 
  | acheong08 wrote:
  | I have an open source implementation of normalized JA3
  | fingerprinting at https://github.com/fidraC/canary (it includes
  | CreepJS; other stuff is still to be added / WIP). It's not very
  | complicated, but I did waste some time before realizing that
  | GREASE was everywhere, not just in extensions.
  | 
  | Note: That is my secondary account for school. I'm not stealing
  | someone else's work. I use a different name as a sort of mental
  | compartmentalization between different spheres of my life.
 
| batch12 wrote:
| Ugh, just remembered hotornot circa 2004. Those neurons haven't
| been exercised in a while...
 
  | [deleted]
 
___________________________________________________________________
(page generated 2023-09-15 23:01 UTC)