Blocking Malicious Bots With Centralized Machine Learning

About a quarter of all web traffic comes not from human users but from automated bots – some good, some bad.

Bots can be broken into three categories:

  1. Clearly good bots. For example, GoogleBot, which is critical for the search engine to index your website.
  2. Clearly bad bots. For example, bots that crawl sites to collect emails for spamming, execute basic denial of service attacks, scan for vulnerabilities, etc.
  3. Bad bots that pretend to be human, to avoid detection and bypass protections. These are always hostile and typically involved in credential stuffing, price scraping, content theft, and more.

Of the three categories, only category two (clearly bad bots) is easy to deal with using standard security systems.

To handle these, you can rate-limit traffic to your site, block bad referrers, and block known bad user agents. You could also employ a reputation-based IP blocking system like NovaSense, which reduces automated browsing by more than 95%.
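The three standard controls named above can be sketched in a few lines. This is a minimal illustration, not how Nova or NovaSense works: the class name `BasicBotFilter`, the blocklist entries, and the limits are all assumptions chosen for the example.

```python
import time
from collections import defaultdict, deque

# Illustrative blocklists only (assumed values); real lists are maintained separately.
BAD_USER_AGENT_TOKENS = ("sqlmap", "nikto", "masscan")
BAD_REFERRER_HOSTS = ("spam-referrer.example",)


class BasicBotFilter:
    """Hypothetical filter: user-agent blocklist, referrer blocklist, per-IP rate limit."""

    def __init__(self, max_requests=5, window_seconds=10.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # ip -> timestamps of recently allowed requests

    def check(self, ip, user_agent, referrer="", now=None):
        now = time.monotonic() if now is None else now

        # 1. Block known bad user agents (case-insensitive substring match).
        ua = user_agent.lower()
        if any(token in ua for token in BAD_USER_AGENT_TOKENS):
            return "block:user-agent"

        # 2. Block bad referrers.
        ref = referrer.lower()
        if any(host in ref for host in BAD_REFERRER_HOSTS):
            return "block:referrer"

        # 3. Sliding-window rate limit: drop timestamps older than the window,
        #    then refuse the request if the IP has used up its allowance.
        hits = self._hits[ip]
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return "block:rate-limit"

        hits.append(now)
        return "allow"
```

Checks like these only catch bots that announce themselves or behave noisily, which is why they work for category two and not for category three.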

The real challenge, though, is to stop category three bots without stopping category one bots. That means allowing GoogleBot and other well-behaved automated systems through, while detecting and blocking advanced bots that use tools like headless Chrome to pose as real users.

Remember that it can be more costly for a business today to block a search engine spider and be de-listed than to be the victim of an actual attack. At the same time, a real web browser that can run JavaScript, wait between pages, emulate clicking a mouse, and more can be very hard to identify as a bot.

This is where Nova's latest AI-powered Bot Protection system comes into play. Nova can allow legitimate automated traffic (like the above-mentioned GoogleBot) while detecting and blocking malicious automated traffic at scale.


Nova Nodes detect suspicious behavior automatically and locally block most automated or dangerous traffic. Snapt has now added the ability to submit a questionable (but unclear) browser to Nova's centralized machine learning (ML) system, which is highly effective and accurate at recognizing natural browsing patterns, identifiers associated with bot traffic, and network reputation.

Nova's centralized intelligence then informs the Node what action to take and allows it to confidently block unwanted bot traffic with extremely high accuracy.
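The flow described above (decide locally when the case is clear, escalate only the ambiguous cases to a central classifier) can be sketched roughly as follows. Everything here is hypothetical: the function names, the score thresholds, and the feature dictionary are assumptions for illustration and do not reflect Nova's actual interface or model.

```python
from typing import Callable

# Assumed thresholds; the article does not say how Nova scores traffic.
ALLOW_BELOW = 0.2
BLOCK_ABOVE = 0.8


def decide(local_score: float, features: dict,
           ask_central: Callable[[dict], str]) -> str:
    """Return 'allow' or 'block' for a single browser session."""
    # Clear cases are handled on the node without a round trip.
    if local_score >= BLOCK_ABOVE:
        return "block"
    if local_score <= ALLOW_BELOW:
        return "allow"

    # Ambiguous cases are escalated to the central classifier.
    verdict = ask_central(features)

    # Fail open on an unexpected answer: wrongly blocking a legitimate
    # visitor or search engine spider can cost more than one missed bot.
    return verdict if verdict in ("allow", "block") else "allow"


def stub_central(features: dict) -> str:
    """Stand-in for the central ML service (assumption, not a real API)."""
    return "block" if features.get("headless") else "allow"
```

For example, `decide(0.5, {"headless": True}, stub_central)` escalates and returns `"block"`, while `decide(0.1, {"headless": True}, stub_central)` is allowed locally without consulting the central service.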

Snapt has spent years collecting billions of blocked web requests, analyzing browsers through our network of thousands of web application firewalls (WAFs), and training our AI with an extensive dataset. The Nova Bot AI is capable of accurately identifying both modern and legacy automated browsing while still allowing search engine spiders and legitimate consumers through.

Nova Bot Protection is included as a part of the full Nova WAF solution, ensuring that you are protected against all manner of threats and denial of service attempts against your business-critical services.
