

## GPU Development Lane

Start here. Do not buy hardware yet.

| Provider | Recommended Start | Why | Indicative Price |
| --- | --- | --- | --- |
| SimplePod-style GPU rental | RTX 4090, 24 GB VRAM | Best first lane for Qwen 7B/14B demo development; start/stop hourly. | SimplePod blog quotes RTX 4090 around $0.23/hr and RTX 3060 around $0.05/hr. |
| RunPod | RTX 4090 or A100 | Better-known alternative with broader ecosystem; good second provider for repeatable demos. | Public GPU pricing trackers commonly show RTX 4090 around $0.34/hr, but check RunPod directly. |
| Vast.ai | RTX 4090 marketplace | Cheapest experiments when interruption is acceptable. | Public trackers show roughly $0.27-$0.40/hr for RTX 4090 depending on host and supply. |

Recommended EXOS start:

- Qwen 7B on RTX 4090
- one request at a time
- no fine-tuning
- no RAG for the first baseline
- add controlled evidence after the baseline works

Avoid RTX 3060 for the main demo unless cost is the only concern. It can run smaller quantized models, but the 12 GB VRAM ceiling will constrain context, throughput, and 14B experiments.
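The VRAM guidance above can be sanity-checked with a rough rule of thumb: weight memory is approximately parameter count times bytes per parameter. The sketch below is illustrative only. The rounded parameter counts (7 and 14 billion), the precision table, and the 2 GB headroom allowance are assumptions that do not come from these notes, and the estimate ignores KV cache, activations, and runtime overhead, so real usage will be higher.

```python
# Rule-of-thumb estimate of GPU memory needed for model weights only.
# Assumptions (not from the pricing notes): parameter counts are rounded,
# and KV cache, activations, and framework overhead are NOT modelled.

BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_gb(params_billion: float, precision: str) -> float:
    """Approximate weight size in GB (1 GB = 1e9 bytes)."""
    return params_billion * BYTES_PER_PARAM[precision]


def fits(params_billion: float, precision: str, vram_gb: float,
         headroom_gb: float = 2.0) -> bool:
    """True if the weights plus a hypothetical headroom allowance fit in VRAM."""
    return weight_gb(params_billion, precision) + headroom_gb <= vram_gb


if __name__ == "__main__":
    gpus = (("RTX 3060", 12), ("RTX 4090", 24))
    for gpu_name, vram in gpus:
        for size in (7, 14):
            for precision in BYTES_PER_PARAM:
                verdict = "fits" if fits(size, precision, vram) else "does not fit"
                print(f"{gpu_name} ({vram} GB): {size}B {precision} "
                      f"~{weight_gb(size, precision):.1f} GB weights, {verdict}")
```

Under these assumptions, a 7B model at fp16 needs about 14 GB for weights alone, which already exceeds the RTX 3060's 12 GB, while a 14B model only fits on either card once quantized.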

## VPS Control Plane

The VPS is the always-on demo/development server. The GPU is rented separately.

| Provider | Starting Size | Use | Indicative Price |
| --- | --- | --- | --- |
| Hetzner | CPX31/CX42-class, 4-8 vCPU, 8-16 GB RAM, 160 GB SSD/NVMe | Recommended starting VPS because EXOS/BNL already think in terms of EU infrastructure and it is cost-effective. | Current third-party trackers show CPX31-class around $17-$25/mo depending on region; Hetzner CX/CPX pricing must be confirmed in the console. |
| DigitalOcean | Basic 4 vCPU / 8 GB / 160 GB | Predictable UI, simple billing, easy for developers. | Official table lists 4 vCPU / 8 GB Basic at $48/mo; 2 vCPU / 4 GB at $24/mo. |
| Vultr | Cloud Compute 4 vCPU / 8 GB / 180 GB | Similar global VPS competitor with simple hourly/monthly pricing. | Official table lists 4 vCPU / 8 GB at $48/mo; 2 vCPU / 4 GB at $24/mo. |

Recommended EXOS VPS:

- minimum: 4 vCPU, 8 GB RAM, 160 GB SSD
- better: 8 vCPU, 16 GB RAM, 240-320 GB SSD
- OS: Ubuntu 24.04 LTS
- stack: Docker Compose, PHP, database, Qdrant, Directus, FOSSBilling, OPA, observability

## Azure Target Sizing

Azure is the enterprise landing zone, not the cheapest development lane.

| Azure Role | Starting Option | Notes |
| --- | --- | --- |
| Web/API app | Azure App Service, Container Apps, or AKS | Container Apps is a good middle path before AKS complexity. |
| Database | Azure Database for PostgreSQL/MySQL | Use managed backups, private networking, and Key Vault secrets. |
| Vector/evidence | Qdrant in Container Apps/AKS or Azure AI Search | Qdrant preserves the current pattern; Azure AI Search is easier to govern in Microsoft estates. |
| Observability | Application Insights + Azure Monitor | Replacement for Langfuse-style trace review can be partial at first. |
| Workflow | Logic Apps / Power Automate | Replacement path for n8n workflows. |
| Agent studio | Copilot Studio / Azure AI Foundry | Enterprise path for role-aligned EXOS agents. |
| GPU inference | NCasT4_v3 for light inference; NC A100/H100 families for heavier workloads | Azure GPU quotas must be requested early. |

Azure GPU reference:

- `Standard_NC4as_T4_v3`: 4 vCPU, 28 GB RAM, 1 NVIDIA T4 with 16 GB VRAM.
- T4 is acceptable for light Qwen 7B demos, but slower than RTX 4090.
- A100/H100 classes are better for higher throughput, larger context, and production proof.
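Exact pricing for the GPU SKU named above can be looked up through the Azure Retail Prices API listed under Sources. The following is a minimal sketch, not a vetted client: the helper names `build_price_url` and `fetch_prices` are hypothetical, the region `westeurope` is an assumed example, and only the first page of results is read (the API paginates through a `NextPageLink` field, which this sketch ignores).

```python
import json
import urllib.parse
import urllib.request

# Public, unauthenticated endpoint of the Azure Retail Prices API.
API_ROOT = "https://prices.azure.com/api/retail/prices"


def build_price_url(sku: str, region: str) -> str:
    """Build a query URL filtered by ARM SKU name and ARM region name."""
    odata_filter = f"armSkuName eq '{sku}' and armRegionName eq '{region}'"
    return f"{API_ROOT}?$filter={urllib.parse.quote(odata_filter)}"


def fetch_prices(sku: str, region: str) -> list:
    """Return the first page of price items (requires network access)."""
    with urllib.request.urlopen(build_price_url(sku, region), timeout=30) as resp:
        payload = json.load(resp)
    return payload.get("Items", [])


if __name__ == "__main__":
    # Region is an assumed example; substitute the target deployment region.
    print(build_price_url("Standard_NC4as_T4_v3", "westeurope"))
    # To query live prices, uncomment:
    # for item in fetch_prices("Standard_NC4as_T4_v3", "westeurope"):
    #     print(item.get("skuName"), item.get("retailPrice"), item.get("unitOfMeasure"))
```

Results typically include several entries per SKU (for example spot and low-priority variants), so the returned items should be filtered before any figure is quoted in a budget.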

## Budget Guidance

Low-cost development month:

- Hetzner-class VPS: roughly $18-$35/mo
- Hourly RTX 4090: 20 development hours at $0.23-$0.40/hr = $4.60-$8.00
- Extra snapshots/backups/domain/email: allow $10-$25
- Practical total: under $75/mo if GPU is used only when needed

Client-demo month with more GPU use:

- VPS: $18-$50/mo
- RTX 4090: 80 hours at $0.23-$0.40/hr = $18.40-$32.00
- Observability/storage/backups: $20-$50
- Practical total: $60-$130/mo before Azure services

Azure production proof:

- Expect materially higher cost.
- Use Azure Pricing Calculator and Retail Prices API for exact region/SKU pricing.
- Request GPU quota before committing dates.
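The two monthly scenarios above are simple range arithmetic, so they are easy to recompute when hours or rates change. The sketch below reproduces them from the line items given in this section; the `Range` class and `gpu_cost` helper are illustrative names, not part of any EXOS tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    """A low/high cost range in USD."""
    low: float
    high: float

    def __add__(self, other: "Range") -> "Range":
        return Range(self.low + other.low, self.high + other.high)

    def __str__(self) -> str:
        return f"${self.low:.2f}-${self.high:.2f}"


def gpu_cost(hours: float, hourly_rate: Range) -> Range:
    """Cost range for a number of rented GPU hours."""
    return Range(hours * hourly_rate.low, hours * hourly_rate.high)


# Indicative RTX 4090 hourly range used in the budget bullets.
RTX_4090_RATE = Range(0.23, 0.40)

# Low-cost development month: VPS + 20 GPU hours + extras.
low_cost_month = Range(18, 35) + gpu_cost(20, RTX_4090_RATE) + Range(10, 25)

# Client-demo month: VPS + 80 GPU hours + observability/storage/backups.
client_demo_month = Range(18, 50) + gpu_cost(80, RTX_4090_RATE) + Range(20, 50)

if __name__ == "__main__":
    print("Low-cost development month:", low_cost_month)
    print("Client-demo month:", client_demo_month)
```

The line items sum to $32.60-$68.00 for the development month, which sits under the $75 ceiling, and to $56.40-$132.00 for the client-demo month, which the guidance rounds to $60-$130.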

## Sources

- EXOS public positioning: https://www.exos-systems.com/
- SimplePod GPU pricing blog: https://simplepod.ai/blog/cloud-gpu-pricing/
- SimplePod rental rationale: https://simplepod.ai/blog/why-renting-cloud-gpus-is-better-than-buying/
- DigitalOcean Droplet pricing: https://www.digitalocean.com/pricing/droplets
- DigitalOcean GPU pricing docs: https://docs.digitalocean.com/products/droplets/details/pricing/
- Vultr pricing: https://www.vultr.com/pricing/
- Hetzner regular performance cloud: https://www.hetzner.com/cloud/regular-performance
- Hetzner price adjustment note: https://docs.hetzner.com/general/infrastructure-and-availability/price-adjustment/
- Azure NCasT4_v3 specs: https://learn.microsoft.com/en-us/azure/virtual-machines/sizes/gpu-accelerated/ncast4v3-series
- Azure NC family overview: https://learn.microsoft.com/en-us/azure/virtual-machines/sizes/gpu-accelerated/nc-family
- Azure Retail Prices API: https://learn.microsoft.com/en-us/rest/api/cost-management/retail-prices/azure-retail-prices