AI security and retrieval products designed from the ground up for LLM-powered systems — built internally and available to organizations that need the same leverage.
AI security layer that protects LLMs with semantic fencing. Validates prompts in real time, prevents prompt injection, and ensures strict contextual compliance. Start a project from a topic, a URL, or your own documents — CrocoTiger generates the rules.
Define your semantic fence from a topic (auto-generate rules), a URL (scrape context and constraints), or your own documents (file-based grounding).
Full visibility into your security model's performance: execution logs, a sample of the synthetic training data generated, and attack metrics against common vectors.
Battle-test your defenses with adversarial patterns for Prompt Injection, Jailbreaks, Red Teaming, Policy Evasion, and Obfuscation — using Garak, promptfoo, IBM Research AttQ, and more.
Real-time visibility into the model construction process: topic datasets generation, topics restriction, attack dataset generation, experiment data creation, autotuning, and benchmarking.
Test the built project directly — submit any prompt and receive an acceptance or refusal response based on your project's configuration.
Connect your OpenAI and Gemini API keys. Gain full control over usage limits, pricing tiers, and project-specific configurations.
Security auditing library that runs automated adversarial attacks against AI systems — LLMs and semantic fences alike. Coordinates four red-teaming frameworks, routes attack prompts to your target via HTTP or a custom handler, and produces reports in PDF, HTML, CSV, and Markdown.
pip install crocotester
50+ probe categories: DAN jailbreaks, known-bad signatures, prompt injections, and encoding attacks. Broad coverage for both LLMs and semantic fences.
Single-turn seeds plus multi-turn strategies: Crescendo, Red Teaming, and Tree of Attacks with Pruning. Designed for systematic adversarial exploration.
StrongREJECT, B3, Fortress, AgentHarm, and AgentDojo benchmarks. Covers safety evaluation across agentic and non-agentic AI systems.
Config-driven red team probing with customizable test suites. Flexible targeting for any AI endpoint via curl commands or custom handlers.
$PROMPT placeholderLLM or SEMANTIC_FENCE--max-attacksHybrid, high-dimensional similarity engine built for RAG from the ground up. Combines keyword and vector search to return exact results — not approximations — so LLMs can respond consistently regardless of query phrasing.
Blends keyword search (BM25) and vector search into a single query pipeline. Embeddings alone can miss important results — combining both methods ensures LLMs receive complete and consistent context.
Optimized for embeddings of 1024 dimensions or more. The top 20 models on the MTEB leaderboard average 3,900 dimensions — sim_LAR is built to handle that space efficiently without sacrificing retrieval accuracy.
Creating and updating an index quickly is critical to keeping data as fresh as possible. sim_LAR creates and updates indexes 32–99× faster than competing approaches, enabling real-time ingestion for RAG pipelines.
Traditional similarity engines trade accuracy for speed, returning approximate results. In RAG, approximate is not enough — LLMs need exact results to respond consistently. sim_LAR focuses on perfect retrieval: the only engine to achieve 100% retrieval rate on HotpotQA.
Benchmarked against business-oriented similarity engines that support hybrid search. Max Position metric measures the highest rank of a relevant chunk across all queries — lower is better.