---
title: "Managing tools"
description: "This guide explains how to create a new tool in the Flowsint ecosystem. Tools are low-level wrappers around external utilities, Docker containers, and APIs that enrichers use to gather intelligence. Understanding the tool architecture will help you extend Flowsint's capabilities with new data sources and reconnaissance utilities."
category: "Developers"
order: 9
author: "Flowsint Team"
tags: ["tutorial", "developers", "creating-a-new-tool"]
version: "1.2.8"
last_updated_at: "2026-05-15"
---

## Understanding tools

Tools in Flowsint serve as abstraction layers between enrichers and external systems. They provide a consistent interface for executing Docker containers, calling APIs, or running Python libraries. While enrichers handle high-level orchestration and graph database operations, tools focus exclusively on executing external commands and returning raw results.

Every tool implements a basic interface with methods for naming, categorization, and execution. Tools don't know anything about Pydantic types, Neo4j graphs, or the broader Flowsint architecture. They just wrap external functionality and return data.

Flowsint currently includes tools for subdomain enumeration, port scanning, DNS queries, WHOIS lookups, web crawling, and business intelligence.

## Tool architecture

The tool system has a two-tier inheritance structure. At the base level, the abstract `Tool` class defines the interface every tool must implement. For tools that run in Docker containers, an intermediate `DockerTool` class handles all the container lifecycle management.

### The Tool base class

Every tool inherits from the abstract `Tool` class, which lives at `flowsint-enrichers/src/tools/base.py`. Here's what it looks like:

```python
from abc import ABC, abstractmethod
from typing import Any


class Tool(ABC):
    """Abstract base class for all tools."""

    @classmethod
    @abstractmethod
    def name(cls) -> str:
        """Return the tool name."""
        pass

    @classmethod
    @abstractmethod
    def category(cls) -> str:
        """Return the tool category."""
        pass

    @classmethod
    @abstractmethod
    def description(cls) -> str:
        """Return a description of what the tool does."""
        pass

    @classmethod
    @abstractmethod
    def version(cls) -> str:
        """Return the tool version."""
        pass

    @abstractmethod
    def launch(self, value: str, *args, **kwargs) -> Any:
        """Execute the tool and return results."""
        pass
```

Any tool you create must implement these five methods. The first four are class methods that provide metadata about the tool. The `launch` method is where the actual work happens.

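
To see what "must implement" means in practice, the sketch below redeclares a condensed stand-in for the `Tool` interface (so the snippet runs on its own) and shows that Python refuses to instantiate a subclass that leaves an abstract method out. `EchoTool`, `IncompleteTool`, and `can_instantiate` are hypothetical names used only for illustration.

```python
from abc import ABC, abstractmethod
from typing import Any


class Tool(ABC):
    """Condensed stand-in for the Tool interface shown above."""

    @classmethod
    @abstractmethod
    def name(cls) -> str: ...

    @classmethod
    @abstractmethod
    def category(cls) -> str: ...

    @classmethod
    @abstractmethod
    def description(cls) -> str: ...

    @classmethod
    @abstractmethod
    def version(cls) -> str: ...

    @abstractmethod
    def launch(self, value: str, *args, **kwargs) -> Any: ...


class EchoTool(Tool):
    """Hypothetical tool that implements all five methods."""

    @classmethod
    def name(cls) -> str:
        return "echo"

    @classmethod
    def category(cls) -> str:
        return "Examples"

    @classmethod
    def description(cls) -> str:
        return "Returns its input unchanged"

    @classmethod
    def version(cls) -> str:
        return "0.1.0"

    def launch(self, value: str, *args, **kwargs) -> Any:
        return value


class IncompleteTool(Tool):
    """Hypothetical tool that only implements name()."""

    @classmethod
    def name(cls) -> str:
        return "incomplete"


def can_instantiate(tool_cls) -> bool:
    """Return True if the class has no abstract methods left."""
    try:
        tool_cls()
        return True
    except TypeError:
        # Raised by ABC when abstract methods are still unimplemented.
        return False
```

Because the metadata methods are class methods, they can be read without creating an instance, for example `EchoTool.name()`.
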
### The DockerTool class

Most security and reconnaissance tools run in Docker containers for isolation and portability. The `DockerTool` class at `flowsint-enrichers/src/tools/dockertool.py` provides all the infrastructure for running containerized tools.

When you inherit from `DockerTool`, you get automatic image management, container execution, volume mounting, environment variable handling, and cleanup. You just specify the Docker image name and implement how to construct the command.

Here's a simplified view of what `DockerTool` provides:

```python
from typing import Any, Optional

import docker

from tools.base import Tool


class DockerTool(Tool):
    """Base class for tools that run in Docker containers."""

    def __init__(self, image: str, default_tag: str = "latest"):
        """Initialize with Docker image information."""
        self.image = image
        self.default_tag = default_tag
        self.docker_client = docker.from_env()

    def install(self) -> None:
        """Pull the Docker image if not already present."""
        # Pulls image from Docker Hub
        pass

    def is_installed(self) -> bool:
        """Check if the Docker image exists locally."""
        # Checks local images
        pass

    def launch(self, command: str, volumes: Optional[dict] = None,
               timeout: int = 30, environment: Optional[dict] = None) -> Any:
        """Run a command in the Docker container."""
        # Executes container and returns output
        pass
```

The `launch` method in `DockerTool` handles container execution. It sets up the environment, mounts volumes if needed, runs the container, captures output, and cleans up afterward.

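
The sequence described above (run, wait, capture, clean up) could look roughly like the following with the Docker SDK for Python. This is a sketch under assumptions, not Flowsint's actual implementation: `run_in_container` is a hypothetical helper, and `client` is assumed to be the object returned by `docker.from_env()`.

```python
from typing import Any, Dict, Optional


def run_in_container(
    client: Any,
    image: str,
    command: str,
    volumes: Optional[Dict[str, dict]] = None,
    environment: Optional[Dict[str, str]] = None,
    timeout: int = 30,
) -> str:
    """Run `command` in a fresh container and return its combined output."""
    # detach=True returns a Container object immediately, so the wait
    # below can enforce the timeout.
    container = client.containers.run(
        image,
        command=command,
        volumes=volumes or {},
        environment=environment or {},
        detach=True,
    )
    try:
        # Blocks until the container exits; raises if `timeout` seconds
        # elapse first.
        container.wait(timeout=timeout)
        raw = container.logs(stdout=True, stderr=True)
        return raw.decode("utf-8", errors="replace")
    finally:
        # Always remove the container, even after a timeout or error.
        container.remove(force=True)
```

Passing the client in as a parameter keeps the helper easy to exercise with a fake client in unit tests, without a Docker daemon.
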
## Creating a simple API-based tool

Let's start with the simpler case of creating a tool that calls an external API. We'll create a hypothetical tool for querying a threat intelligence service.

### File structure

Create a new Python file in the appropriate category directory under `flowsint-enrichers/src/tools/`. For a security-related tool, you might use `tools/security/`:

```bash
cd flowsint-enrichers/src/tools/security/
touch threat_intel.py
```

If the category directory doesn't exist, create it first and add an `__init__.py` file to make it a Python package.

### Basic implementation

Here's a complete example of an API-based tool:

```python
from tools.base import Tool
from typing import Any, Dict, List, Optional
import requests


class ThreatIntelTool(Tool):
    """Query threat intelligence data from an external API."""

    api_endpoint = "https://api.threatintel.example.com/v1"

    @classmethod
    def name(cls) -> str:
        """Return the tool name."""
        return "threatintel"

    @classmethod
    def category(cls) -> str:
        """Return the category this tool belongs to."""
        return "Threat Intelligence"

    @classmethod
    def description(cls) -> str:
        """Return a description of what this tool does."""
        return "Queries threat intelligence data for IPs, domains, and hashes"

    @classmethod
    def version(cls) -> str:
        """Return the tool version."""
        return "1.0.0"

    def launch(
        self,
        indicator: str,
        indicator_type: str = "ip",
        api_key: Optional[str] = None
    ) -> List[Dict[str, Any]]:
        """
        Query the threat intelligence API.

        Args:
            indicator: The indicator to query (IP, domain, hash, etc.)
            indicator_type: Type of indicator (ip, domain, hash)
            api_key: API key for authentication

        Returns:
            List of threat intelligence records
        """
        if not api_key:
            raise ValueError("API key is required")

        headers = {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json"
        }
        params = {
            "indicator": indicator,
            "type": indicator_type
        }

        try:
            response = requests.get(
                f"{self.api_endpoint}/query",
                headers=headers,
                params=params,
                timeout=30
            )
            response.raise_for_status()
            return response.json().get("results", [])
        except requests.exceptions.RequestException as e:
            print(f"Error querying threat intel API: {e}")
            return []
```

This tool follows a straightforward pattern. The class methods provide metadata that enrichers and the registry use. The `launch` method implements the actual API interaction, handling authentication, making the request, and returning structured data.

Notice how the tool returns simple Python data structures like lists and dictionaries. Tools don't know about Pydantic types or Flowsint models. That's the enricher's job.

## Creating a Docker-based tool

Docker-based tools are more common in Flowsint because most reconnaissance utilities need specific dependencies and isolated environments. Let's walk through creating a tool that wraps a hypothetical Docker-based subdomain scanner.

### Setting up the class

Start by inheriting from `DockerTool` and providing the Docker image information:

```python
from tools.dockertool import DockerTool
from typing import List, Optional, Any


class MySubdomainTool(DockerTool):
    """Wrapper for a Docker-based subdomain enumeration tool."""

    image = "org/subdomain-scanner"
    default_tag = "latest"

    def __init__(self):
        """Initialize the tool with Docker image information."""
        super().__init__(self.image, self.default_tag)
```

The `image` and `default_tag` class attributes tell `DockerTool` which Docker image to use. When you instantiate the tool, it will automatically connect to the Docker daemon.

### Implementing the launch method

The `launch` method needs to construct the command that runs inside the container and handle the results:

```python
import json


class MySubdomainTool(DockerTool):
    # __init__ as shown above

    def launch(
        self,
        domain: str,
        timeout: int = 300,
        wordlist: Optional[str] = None
    ) -> List[str]:
        """
        Enumerate subdomains for a given domain.

        Args:
            domain: Target domain to enumerate
            timeout: Maximum execution time in seconds
            wordlist: Optional path to custom wordlist file

        Returns:
            List of discovered subdomain strings
        """
        # Ensure the Docker image is available
        if not self.is_installed():
            self.install()

        # Build the command that runs inside the container
        command = f"-d {domain}"
        if wordlist:
            command += f" -w {wordlist}"
        # Add JSON output flag for easier parsing
        command += " -json"

        # Execute the container
        try:
            result = super().launch(
                command=command,
                timeout=timeout
            )
            # Parse the output
            subdomains = self._parse_output(result)
            return subdomains
        except Exception as e:
            print(f"Error running subdomain scanner: {e}")
            return []

    def _parse_output(self, output: str) -> List[str]:
        """Parse the tool output and extract subdomains."""
        subdomains = []
        for line in output.strip().split('\n'):
            if not line:
                continue
            try:
                data = json.loads(line)
            except json.JSONDecodeError:
                continue
            # Skip valid JSON that is not an object (numbers, lists, ...)
            if isinstance(data, dict) and 'subdomain' in data:
                subdomains.append(data['subdomain'])
        return list(set(subdomains))  # Remove duplicates
```

This implementation shows several important patterns. First, it checks if the Docker image is installed and pulls it if necessary. Second, it constructs the command string that will run inside the container. Third, it calls the parent class's `launch` method to handle the actual container execution. Finally, it parses the output into a clean Python data structure.

### Handling volumes

Some tools need access to files on the host system. You can mount volumes when calling the parent's `launch` method:

```python
class MySubdomainTool(DockerTool):
    # __init__ and _parse_output as shown above

    def launch(self, domain: str, wordlist_path: Optional[str] = None) -> List[str]:
        """Run the tool with optional wordlist file."""
        command = f"-d {domain}"
        volumes = None

        if wordlist_path:
            # Mount the wordlist file into the container
            volumes = {
                wordlist_path: {
                    'bind': '/wordlist.txt',
                    'mode': 'ro'  # read-only
                }
            }
            command += " -w /wordlist.txt"

        result = super().launch(
            command=command,
            volumes=volumes
        )
        return self._parse_output(result)
```

The volumes dictionary maps host paths to container paths. You can specify the mount mode as 'ro' for read-only or 'rw' for read-write.

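
One detail worth guarding against is a relative or missing host path: Docker generally treats a non-absolute source as a named volume rather than a bind mount. The hypothetical helper below, `build_readonly_mount`, is a small sketch that normalizes the path and builds the dictionary in the shape shown above.

```python
import os
from typing import Dict


def build_readonly_mount(host_path: str, container_path: str) -> Dict[str, Dict[str, str]]:
    """Return a volumes dict that bind-mounts one host file read-only."""
    # Resolve to an absolute path so the key is treated as a host path.
    absolute = os.path.abspath(host_path)
    if not os.path.isfile(absolute):
        raise FileNotFoundError(f"File to mount not found: {absolute}")
    return {absolute: {"bind": container_path, "mode": "ro"}}
```

The result can be passed directly as the `volumes` argument, for example `super().launch(command=command, volumes=build_readonly_mount(wordlist_path, "/wordlist.txt"))`.
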
### Using environment variables

For tools that need API keys or configuration through environment variables:

```python
class MySubdomainTool(DockerTool):
    # __init__ and _parse_output as shown above

    def launch(self, domain: str, api_key: Optional[str] = None) -> List[str]:
        """Run the tool with optional API key for enhanced scanning."""
        command = f"-d {domain}"
        environment = {}

        if api_key:
            environment['API_KEY'] = api_key

        result = super().launch(
            command=command,
            environment=environment
        )
        return self._parse_output(result)
```

## Testing your tool

Creating tests for your tool helps ensure it works correctly and makes it easier to catch regressions. Create a test file in `flowsint-enrichers/tests/tools/` that mirrors your tool's location:

```python
# tests/tools/security/test_threat_intel.py
from tools.security.threat_intel import ThreatIntelTool
import pytest


def test_tool_metadata():
    """Test that tool metadata is correctly defined."""
    assert ThreatIntelTool.name() == "threatintel"
    assert ThreatIntelTool.category() == "Threat Intelligence"
    assert "threat" in ThreatIntelTool.description().lower()


def test_tool_launch_requires_api_key():
    """Test that launch method requires an API key."""
    tool = ThreatIntelTool()
    with pytest.raises(ValueError):
        tool.launch("192.0.2.1")


def test_tool_launch_with_api_key(monkeypatch):
    """Test successful API query with mocked response."""
    tool = ThreatIntelTool()

    # Mock the requests.get call
    def mock_get(*args, **kwargs):
        class MockResponse:
            def raise_for_status(self):
                pass

            def json(self):
                return {"results": [{"indicator": "192.0.2.1", "threat_level": "high"}]}

        return MockResponse()

    monkeypatch.setattr("requests.get", mock_get)

    results = tool.launch("192.0.2.1", api_key="test_key")
    assert len(results) == 1
    assert results[0]["indicator"] == "192.0.2.1"
```

For Docker-based tools, your tests need Docker to be running:

```python
# tests/tools/network/test_my_subdomain_tool.py
from tools.network.my_subdomain_tool import MySubdomainTool
import pytest


@pytest.mark.docker
def test_tool_install():
    """Test that the Docker image can be pulled."""
    tool = MySubdomainTool()
    tool.install()
    assert tool.is_installed()


@pytest.mark.docker
def test_tool_launch():
    """Test running the tool against a domain."""
    tool = MySubdomainTool()
    results = tool.launch("example.com")
    assert isinstance(results, list)
```

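The `@pytest.mark.docker` decorator helps you separate tests that require Docker from those that don't. For example, running `pytest -m "not docker"` deselects the marked tests on machines without a Docker daemon.
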
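
Custom markers such as `docker` are usually registered so that pytest does not warn about an unknown mark. A minimal sketch, assuming a `conftest.py` at the root of the test directory (the file location and the marker description are assumptions, not taken from the Flowsint repository):

```python
# conftest.py (sketch)


def pytest_configure(config):
    """Register the custom `docker` marker with pytest."""
    config.addinivalue_line(
        "markers", "docker: test requires a running Docker daemon"
    )
```

With the marker registered, `pytest -m docker` runs only the Docker-dependent tests.
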
## Best practices

When creating tools, focus on simplicity and single responsibility. Each tool should wrap exactly one external utility or API. Don't try to combine multiple data sources in a single tool. That's what enrichers are for.

Always handle errors gracefully. Network requests fail, Docker containers crash, and APIs return unexpected data. Your tool should catch these errors, log them appropriately, and return empty results or raise clear exceptions rather than crashing.

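
As one possible shape for this advice, the sketch below logs a failure through the standard `logging` module and then either returns an empty list or raises a single, clearly named exception. `ToolExecutionError` and `run_safely` are hypothetical names introduced for illustration; they are not part of Flowsint.

```python
import logging
from typing import Any, Callable, List

logger = logging.getLogger(__name__)


class ToolExecutionError(RuntimeError):
    """Hypothetical exception for failures the caller must not ignore."""


def run_safely(
    fetch: Callable[[], List[Any]],
    tool_name: str,
    strict: bool = False,
) -> List[Any]:
    """Call `fetch`, logging any failure instead of crashing silently."""
    try:
        return fetch()
    except Exception as exc:
        logger.warning("%s failed: %s", tool_name, exc)
        if strict:
            # Re-raise as one tool-specific type, keeping the original cause.
            raise ToolExecutionError(f"{tool_name} failed") from exc
        return []
```

A tool's `launch` method could wrap its network or container call this way, choosing `strict=True` when an empty result would be misleading.
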
Return simple data structures from the `launch` method. Use lists, dictionaries, strings, and numbers. Don't return Pydantic models or other complex objects. Remember that tools are low-level utilities that enrichers build upon.

For Docker tools, always check if the image is installed before running it. The pattern of checking `is_installed()` and calling `install()` if necessary ensures the tool works even on fresh installations.

When parsing tool output, be defensive. External tools can return unexpected formats, partial results, or garbage data. Validate and clean the output before returning it. Use try-except blocks around parsing logic.

Document your tool thoroughly. The docstrings and parameter descriptions help other developers understand how to use your tool. Future enrichers will rely on this documentation.

## Integrating your tool

Unlike types, tools don't need to be explicitly registered in a central registry. Enrichers import and use them directly. When you create an enricher that uses your new tool, you simply import it:

```python
# In an enricher file
from typing import List

from tools.security.threat_intel import ThreatIntelTool

# Imports of Enricher, Ip, and ThreatReport are omitted here.


class IpToThreatIntelEnricher(Enricher):
    async def scan(self, data: List[Ip]) -> List[ThreatReport]:
        tool = ThreatIntelTool()
        results = []
        for ip in data:
            # api_key is assumed to have been resolved earlier,
            # for example from an enricher parameter
            intel = tool.launch(
                indicator=ip.address,
                indicator_type="ip",
                api_key=api_key
            )
            # Process results...
        return results
```

The enricher instantiates your tool, calls its `launch` method with appropriate parameters, and processes the results into Flowsint types.

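
The "process results" step is where raw dictionaries become typed objects. The sketch below illustrates that conversion with a plain dataclass standing in for a Flowsint type. `ThreatRecord` and `to_records` are hypothetical, and the two field names are borrowed from the mocked API response used in the tests above, not from a real Flowsint model.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass(frozen=True)
class ThreatRecord:
    """Stand-in for a Flowsint type (hypothetical fields)."""
    indicator: str
    threat_level: str


def to_records(raw: List[Dict[str, Any]]) -> List[ThreatRecord]:
    """Convert raw tool output into typed records, skipping bad entries."""
    records = []
    for item in raw:
        indicator = item.get("indicator")
        level = item.get("threat_level")
        # Keep only entries that carry both fields as strings.
        if isinstance(indicator, str) and isinstance(level, str):
            records.append(ThreatRecord(indicator=indicator, threat_level=level))
    return records
```

Keeping this validation in the enricher, rather than in the tool, preserves the split described earlier: tools return raw data, enrichers give it structure.
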
## Common patterns

Several patterns appear frequently in Flowsint tools. Understanding these will help you write tools that fit naturally into the ecosystem.

### The install-check pattern

Most Docker tools follow this pattern at the start of `launch`:

```python
def launch(self, *args, **kwargs):
    if not self.is_installed():
        self.install()
    # Continue with execution
```

This ensures the Docker image is available before trying to run it.

### The command builder pattern

Complex tools often build commands incrementally based on parameters:

```python
def launch(self, target: str, mode: str = "fast", verbose: bool = False):
    command = f"-target {target}"
    if mode == "thorough":
        command += " --thorough"
    if verbose:
        command += " -v"
    result = super().launch(command)
    return result
```
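
When parameters such as `target` come from user input, plain string concatenation lets whitespace or shell metacharacters spill into extra arguments. The sketch below is one way to build the same command from a list and quote each piece with the standard library's `shlex.join`. `build_command` is a hypothetical helper, and the sketch assumes the resulting string is later split shell-style before it reaches the container.

```python
import shlex
from typing import List


def build_command(target: str, mode: str = "fast", verbose: bool = False) -> str:
    """Build the container command from individually quoted arguments."""
    args: List[str] = ["-target", target]
    if mode == "thorough":
        args.append("--thorough")
    if verbose:
        args.append("-v")
    # shlex.join quotes each argument, so unusual characters in `target`
    # stay inside a single argument.
    return shlex.join(args)
```

For ordinary input the output matches the concatenated version: `build_command("example.com", mode="thorough")` yields `-target example.com --thorough`.
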
### The output parser pattern

Many tools separate execution from parsing:

```python
def launch(self, command: str):
    raw_output = super().launch(command)
    return self._parse_output(raw_output)

def _parse_output(self, output: str) -> List[Dict]:
    """Parse raw tool output into structured data."""
    # Parsing logic here
    ...
```

This separation makes the code easier to test and maintain.

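
The testing benefit is that a parser can be exercised with canned output and no Docker daemon. The sketch below uses a standalone function, `parse_subdomains` (a hypothetical copy of the JSON-lines parsing logic from the subdomain example), together with a pytest-style test that feeds it noisy input.

```python
import json
from typing import List


def parse_subdomains(output: str) -> List[str]:
    """Extract unique subdomains from JSON-lines output, sorted."""
    found = set()
    for line in output.splitlines():
        try:
            data = json.loads(line)
        except json.JSONDecodeError:
            continue  # blank lines, banners, truncated JSON
        if isinstance(data, dict) and isinstance(data.get("subdomain"), str):
            found.add(data["subdomain"])
    return sorted(found)


def test_parse_subdomains_ignores_noise():
    raw = "\n".join([
        '{"subdomain": "a.example.com"}',
        "[INF] banner line that is not JSON",
        "",
        '{"subdomain": "a.example.com"}',
        '{"subdomain": "b.example.com"}',
        "42",
    ])
    assert parse_subdomains(raw) == ["a.example.com", "b.example.com"]
```

Tests like this need no `@pytest.mark.docker` marker, so they can run in every environment.
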
## Next steps

Once you've created your tool and tested it, you can build enrichers that use it. Enrichers orchestrate one or more tools to gather intelligence, validate the results, convert them to Flowsint types, and create graph database nodes and relationships.

If your tool requires API keys or other secrets, enrichers can access them through the vault system. When you implement an enricher that uses your tool, you can define parameters of type `vaultSecret` that pull credentials from the user's encrypted vault.

Remember that tools are just one layer in the Flowsint architecture. They provide the raw capabilities, but enrichers provide the intelligence and graph-building logic that makes the platform powerful.