Models
Harpocrates offers a range of confidential AI models optimized for secure inference inside Trusted Execution Environment (TEE) enclaves.
Language Models
llm-secure-7b
Recommended. General-purpose language model optimized for confidential inference. 7B parameters, suitable for most text generation tasks.
llm-secure-13b
Larger model with improved reasoning and instruction-following capabilities. Best for complex tasks requiring deeper understanding.
llm-secure-fast
Optimized for low-latency inference. Smaller model (3B) with faster response times, ideal for real-time applications.
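As a starting point, the sketch below shows how a language model might be selected in a request. It is illustrative only: the endpoint URL, payload field names, and the HARPOCRATES_API_KEY environment variable are assumptions, not part of this documentation; only the model IDs come from the list above.

```python
import os
import requests

# Hypothetical endpoint and payload shape, for illustration only.
API_URL = "https://api.harpocrates.example/v1/completions"  # assumed URL

response = requests.post(
    API_URL,
    headers={"Authorization": f"Bearer {os.environ['HARPOCRATES_API_KEY']}"},
    json={
        "model": "llm-secure-7b",  # or llm-secure-13b / llm-secure-fast
        "prompt": "Summarize the attached contract in three bullet points.",
        "max_tokens": 256,
    },
    timeout=30,
)
response.raise_for_status()
print(response.json())
```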
Embedding Models
embed-secure-base
Generate 768-dimensional embeddings for semantic search, clustering, and retrieval tasks while maintaining data confidentiality.
embed-secure-large
Higher-dimensional embeddings (1536d) for improved accuracy in semantic search and similarity tasks.
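A minimal embeddings request might look like the sketch below. The endpoint URL, field names, and response shape are assumptions made for illustration; the model ID and its 768-dimensional output come from the description above.

```python
import os
import requests

# Hypothetical embeddings endpoint and request/response shape.
API_URL = "https://api.harpocrates.example/v1/embeddings"  # assumed URL

response = requests.post(
    API_URL,
    headers={"Authorization": f"Bearer {os.environ['HARPOCRATES_API_KEY']}"},
    json={
        "model": "embed-secure-base",  # 768-dimensional embeddings
        "input": ["confidential inference", "secure enclave"],
    },
    timeout=30,
)
response.raise_for_status()
vectors = response.json()["data"]  # assumed response shape
print(len(vectors), "embeddings of dimension", len(vectors[0]["embedding"]))
```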
How to Choose a Model
Select the right model based on your use case:
- General Use: Start with llm-secure-7b for the best balance of quality and cost.
- Complex Tasks: Use llm-secure-13b for reasoning, analysis, or long-form content.
- Real-Time: Choose llm-secure-fast for low-latency applications like chatbots.
- Search/RAG: Use embed-secure-base for semantic embeddings.
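For reference, this guidance can be captured in a small lookup. The helper function and use-case keys below are illustrative conveniences, not part of the Harpocrates API; only the model IDs come from the list above.

```python
# Illustrative mapping from the use cases above to suggested model IDs.
USE_CASE_TO_MODEL = {
    "general": "llm-secure-7b",
    "complex": "llm-secure-13b",
    "realtime": "llm-secure-fast",
    "search": "embed-secure-base",
}

def pick_model(use_case: str) -> str:
    """Return the suggested model ID, defaulting to llm-secure-7b."""
    return USE_CASE_TO_MODEL.get(use_case, "llm-secure-7b")

print(pick_model("realtime"))  # -> llm-secure-fast
```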
Model-Specific Parameters
All language models support these parameters:
- temperature (0.0 - 2.0, default: 0.7): Controls randomness. Lower values make output more focused and deterministic.
- max_tokens (1 to context_length): Maximum number of tokens to generate in the response.
- top_p (0.0 - 1.0, default: 0.9): Nucleus sampling parameter. Alternative to temperature for controlling randomness.
- stop (array of strings): Sequences where the model will stop generating tokens.
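The sketch below shows these parameters in a request body. The parameter names and ranges come from the list above; the endpoint URL, other field names, and environment variable are assumptions used only to make the example self-contained.

```python
import os
import requests

# Hypothetical endpoint; the sampling parameters below are documented above.
API_URL = "https://api.harpocrates.example/v1/completions"  # assumed URL

response = requests.post(
    API_URL,
    headers={"Authorization": f"Bearer {os.environ['HARPOCRATES_API_KEY']}"},
    json={
        "model": "llm-secure-7b",
        "prompt": "List three benefits of confidential inference.",
        "temperature": 0.2,   # 0.0 - 2.0; lower = more focused, deterministic output
        "max_tokens": 200,    # up to the model's context length
        "top_p": 0.9,         # nucleus sampling; alternative to temperature
        "stop": ["\n\n"],     # stop generating at the first blank line
    },
    timeout=30,
)
response.raise_for_status()
print(response.json())
```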