Arachni - Web Application Security Scanner Framework / arachni / Issues / #697
Closed
Issue created Apr 14, 2016 by Administrator (@root, Contributor) · 0 of 8 checklist items completed

Overhaul the multi-process scan architecture

Created by: Zapotek

Remove the current multi-process scan code (that nobody uses anyway) and replace it with generic, all-in-one worker processes.

Architecture

The architecture should be similar to the BrowserCluster, but with processes instead of threads.

  • Use method(:my_handler) callbacks rather than proc{}s to help out the GC (see the first sketch after this list).
    • proc closures retain their surrounding environment and we'll need to store a lot of callbacks.
  • Take advantage of copy-on-write by preloading as much data as possible prior to forking (see the second sketch after this list).
  • Use Arachni::RPC for communication.
    • Use UNIX sockets when available, otherwise TCP/IP.
    • Disable SSL.
    • Disable compression.
  • Maybe have Dispatchers expose workers.
    • This will allow multiple machines to share one scan's workload when set up in Grid mode.
    • Similar to the existing multi-process system but much more efficient.
  • Should auto-scale by using #695 (closed).
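
A minimal sketch of the callback-registration point above. ParseJob and Trainer#schedule_parse are hypothetical names, not existing Arachni classes; the only point is why a Method object is cheaper to store than a proc.

```ruby
class ParseJob
  def initialize
    @callbacks = []
  end

  # Callbacks are stored until the worker sends back a result, so whatever
  # they keep alive stays alive with them.
  def on_complete( callback )
    @callbacks << callback
  end

  def complete( result )
    @callbacks.each { |cb| cb.call( result ) }
  end
end

class Trainer
  def schedule_parse( html )
    job = ParseJob.new

    # A proc would close over `html`, `job` and every other local in this
    # scope, keeping them reachable for as long as the callback is stored:
    #
    #   job.on_complete proc { |result| handle_result( result ) }
    #
    # A Method object references only the receiver and the method name, so
    # the locals of this scope become collectable once it returns.
    job.on_complete method(:handle_result)

    job
  end

  def handle_result( result )
    # ... use the data sent back by the worker ...
  end
end
```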
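
And a rough sketch of the copy-on-write/UNIX-socket part, using plain UNIXSocket.pair and Marshal instead of Arachni::RPC just to keep it self-contained; PRELOADED_DATA, handle_job and the worker count are placeholders.

```ruby
require 'socket'

# Placeholder for whatever gets preloaded in the parent (signatures,
# fingerprints, etc.) so that forked workers share it via copy-on-write.
PRELOADED_DATA = { signatures: %w(sig1 sig2 sig3) }

# Placeholder worker-side handler.
def handle_job( job )
  { job: job, signatures_available: PRELOADED_DATA[:signatures].size }
end

WORKER_COUNT = 2

workers = WORKER_COUNT.times.map do
  parent_io, child_io = UNIXSocket.pair

  pid = fork do
    parent_io.close

    # The child shares PRELOADED_DATA's memory pages with the parent for as
    # long as they are only read, never written to.
    loop do
      job = Marshal.load( child_io )
      Marshal.dump( handle_job( job ), child_io )
    end
  end

  child_io.close
  { pid: pid, io: parent_io }
end

# Parent side: push a job to the first worker and read back its result.
Marshal.dump( 'parse this page', workers.first[:io] )
p Marshal.load( workers.first[:io] )

workers.each { |w| Process.kill( 'TERM', w[:pid] ) }
Process.waitall
```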

Responsibilities

The workers should perform actions like:

  • HTML/XML parsing.
    • Can cause 100% CPU usage when parsing very large documents, thus blocking the scan.
    • The Trainer will massively benefit from this, since it does a lot of parsing during page audits.
    • Should also perform the subsequent handling of the parsed document and send back only the result, rather than the parsed document itself; otherwise there's no point in offloading the parsing (see the first sketch after this list).
  • Arachni::Support::Signature processing.
    • Signature generation, refinement and matching can cause 100% CPU usage when dealing with very large data sets, thus blocking the scan.
  • Manage browser processes.
    • The system already launches Ruby life-line processes to ensure that PhantomJS processes aren't left behind as zombies if the parent process disappears for whatever reason (see the second sketch after this list).
    • Since we're going to have the workers anyway, let them handle that as well, to keep the overall number of processes to a minimum.
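
A sketch of the "send back the result, not the document" idea for the HTML/XML parsing item; handle_parse_job and the job/result shapes are made up, and Nokogiri is used only for illustration. The worker does the parsing and the follow-up extraction, then returns a small serializable summary:

```ruby
require 'nokogiri'

# Runs inside a worker process; the job is assumed to carry the page URL and
# the raw HTML.
def handle_parse_job( job )
  document = Nokogiri::HTML( job[:html] )

  # Do the follow-up handling here, in the worker, and send back only the
  # small, serializable outcome (extracted forms and links in this example)
  # instead of shipping the parsed document back to the scan process.
  {
    url:   job[:url],
    forms: document.css( 'form' ).map { |f| { action: f['action'], method: f['method'] } },
    links: document.css( 'a[href]' ).map { |a| a['href'] }
  }
end

p handle_parse_job(
  url:  'http://test.com/',
  html: '<form action="/login" method="post"></form> <a href="/about">About</a>'
)
```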
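
And a sketch of a life-line mechanism as something a worker could own: a small forked process that probes the parent and kills the browser if the parent disappears. The phantomjs command line and the polling interval are placeholders, not the actual BrowserCluster invocation.

```ruby
# Placeholder browser command; in reality this would be the PhantomJS/WebDriver
# invocation the BrowserCluster already uses.
browser_pid = Process.spawn( 'phantomjs', '--webdriver=0' )
parent_pid  = Process.pid

lifeline_pid = fork do
  loop do
    begin
      # Signal 0 only checks whether the parent still exists.
      Process.kill( 0, parent_pid )
    rescue Errno::ESRCH
      # Parent is gone: make sure the browser doesn't linger.
      begin
        Process.kill( 'KILL', browser_pid )
      rescue Errno::ESRCH
      end
      exit
    end

    sleep 1
  end
end
```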