CrimsonDB
Cosine: A Cloud-Cost Optimized NoSQL Storage Engine

Welcome to this demonstration of Cosine, a self-designing key-value storage engine that can always take the shape of a close-to-“perfect” engine architecture given an input workload, a cloud budget, a target performance, and required cloud SLAs. By identifying and formalizing the first principles of storage engine layouts and core key-value algorithms, Cosine constructs a massive design space comprising sextillion (10^36) possible storage engine designs over a diverse space of hardware and cloud pricing policies for three cloud providers: AWS, GCP, and Azure. Cosine spans diverse designs such as Log-Structured Merge-trees, B-trees, Log-Structured Hash-tables, and in-memory accelerators for filters and indexes, as well as trillions of hybrid designs that do not appear in the literature or industry but emerge as valid combinations of the above.

At its core, Cosine includes a unified distribution-aware I/O model and a learned concurrency-aware CPU model that can calculate, with high accuracy, the performance and cloud cost of any possible design on any workload and virtual machine. Cosine can then search through that space in a matter of seconds to find the best design.
Try Out Cosine
Let Cosine design the best storage engine for you and decide which cloud provider and VMs you should use.
Cloud Provider + Hardware + Data Structure
Design Steps
1. Set inputs
2. Set SLA
3. Click the button to continue
Service Level Agreement
Parameters
Requirements
Description
This is offered as a cloud service that you can purchase on top of the core storage and computing resources. You can subscribe to this service to migrate your data from one VM type to another as needed once per month.
This constitutes additional software solutions that you can deploy on top of your core data store. This includes building and testing, automated code-deploy, version control, and custom analytics.
This service includes a monthly backup of your storage charged on a per-GB basis.
This refers to the percentage of time during which the VMs of an application are promised to be up and running. The values specified to the left indicate the availability requirements in terms of monthly uptime percentage. Providers that do not meet your availability requirements are excluded from the optimal configurations.
This parameter refers to the durability guarantee, measured as the number of 9s after the decimal point of 99 (for example, 99.999% has three). Providers that do not meet your durability requirements are excluded from the optimal configurations.
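The availability and durability filters described above can be sketched as follows. This is a minimal illustration, not Cosine's actual code: the `ProviderSLA` class, the helper names, and the provider figures are all hypothetical, and a 30-day month is assumed when converting an uptime percentage into allowed downtime.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ProviderSLA:
    """Hypothetical record of what a provider promises (values are made up)."""
    name: str
    availability_pct: float  # monthly uptime percentage, e.g. 99.9
    durability_nines: int    # number of 9s after the decimal point of 99


def durability_pct(nines: int) -> str:
    """Render a nines count as a percentage string, e.g. 3 -> '99.999'."""
    return "99." + "9" * nines if nines > 0 else "99"


def allowed_downtime_minutes(availability_pct: float, days: int = 30) -> float:
    """Downtime per month permitted by a monthly uptime percentage."""
    return (1 - availability_pct / 100) * days * 24 * 60


def eligible(providers: List[ProviderSLA],
             min_availability_pct: float,
             min_durability_nines: int) -> List[ProviderSLA]:
    """Keep only providers that meet both requirements."""
    return [
        p for p in providers
        if p.availability_pct >= min_availability_pct
        and p.durability_nines >= min_durability_nines
    ]


# Placeholder providers; these are NOT real AWS, GCP, or Azure figures.
providers = [
    ProviderSLA("provider_a", 99.95, 9),
    ProviderSLA("provider_b", 99.5, 9),
    ProviderSLA("provider_c", 99.99, 3),
]
kept = eligible(providers, min_availability_pct=99.9, min_durability_nines=5)
print([p.name for p in kept])
print(round(allowed_downtime_minutes(99.9), 1), "minutes of downtime per month")
```

Under these assumptions, a 99.9% monthly uptime requirement leaves roughly 43 minutes of downtime in a 30-day month, which gives a feel for how strict each setting is.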
Inputs  (The mandatory inputs are indicated by *)
data*
# Entries: Total data items to store.
Entry size (bytes): Key-value pair size in bytes (e.g., 16).
Key size (bytes): Key size in bytes (e.g., 8).
You can manually enter data specifications or alternatively upload a data file. A sample file is here.
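The three data inputs relate to each other by simple arithmetic, sketched below. The function name `dataset_size` and its return format are hypothetical; the sketch assumes that the entry size covers both the key and the value, so the value size is the difference between the two, and it ignores any metadata or storage overhead.

```python
def dataset_size(num_entries: int, entry_size: int, key_size: int) -> dict:
    """Derive the value size and raw data volume from the three data inputs."""
    if num_entries < 0 or entry_size <= 0 or key_size <= 0:
        raise ValueError("inputs must be positive")
    if key_size > entry_size:
        raise ValueError("key size cannot exceed the key-value pair size")
    return {
        "value_size": entry_size - key_size,      # bytes left for the value
        "total_bytes": num_entries * entry_size,  # raw data, no overhead
    }


# Example using the sizes suggested in the form (16-byte entries, 8-byte keys).
print(dataset_size(num_entries=1_000_000, entry_size=16, key_size=8))
```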

workload*
Number of queries: Number of operations in the workload.
Lookups %: Proportion of lookup operations.
Point Lookups %: Proportion of point lookups on existing keys.
Zero result Point Lookups %: Proportion of point lookups that return no result.
Writes %: Proportion of write operations.
Inserts %: Proportion of inserts in the workload.
Blind Updates %: Proportion of blind updates in the workload.
Read Modify Updates %: Proportion of read-modify updates in the workload.
Range Queries %: Proportion of range query operations.
Non-Empty Range Lookup %
Empty Range Lookup %
Target Range Size
You can manually enter workload specifications or alternatively upload a workload file. A sample file is here.
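A workload specification like the one above can be sanity-checked before it is submitted. The sketch below is an assumption about how the fields combine, not the format of Cosine's workload file: it treats the finest-grained categories as a mix that must sum to 100%, and the function and key names are hypothetical.

```python
from typing import Dict


def operation_counts(num_queries: int, mix_pct: Dict[str, float]) -> Dict[str, int]:
    """Turn a percentage mix into per-operation counts.

    Assumes the percentages are non-negative and sum to 100.
    """
    if any(pct < 0 for pct in mix_pct.values()):
        raise ValueError("percentages must be non-negative")
    total = sum(mix_pct.values())
    if abs(total - 100.0) > 1e-9:
        raise ValueError(f"percentages sum to {total}, expected 100")
    counts = {op: int(num_queries * pct / 100) for op, pct in mix_pct.items()}
    # Give any rounding remainder to the largest category so counts add up.
    remainder = num_queries - sum(counts.values())
    if remainder:
        largest = max(mix_pct, key=mix_pct.get)
        counts[largest] += remainder
    return counts


# Hypothetical mix; the keys mirror the form labels but are not an official schema.
mix = {
    "existing_point_lookups": 40,
    "zero_result_point_lookups": 10,
    "inserts": 20,
    "blind_updates": 10,
    "read_modify_updates": 5,
    "non_empty_range_lookups": 10,
    "empty_range_lookups": 5,
}
counts = operation_counts(1_000_000, mix)
print(counts)
```

Checking the mix up front catches the common mistake of entering percentages that do not add up, which would otherwise make the requested design ambiguous.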

budget*
Budget: The total amount of money you are willing to spend on the cloud to run this workload.
performance
Latency (hours): The target performance, in terms of latency, that you want to achieve for this workload.
cloud


Design Storage Engine
Top           %  cloud provider
           cheapest budget
Break-up of I/O cost
Data structure participation
Top           performance
Discovery of hybrid designs
Performance improvement over existing engines
Cost coverage
Latency coverage