Tanzanite Silicon Solutions Inc. is a semiconductor development company based in California and a leader in the development of Compute Express Link™ (CXL™) based products.
Tanzanite’s visionary Tanzanoid TZ architecture and purpose-built “Smart Logic Interface Connector” (SLIC TZ) SoC design enable independent scaling and sharing of memory and compute in a pool, with low latency within and across server racks. The Tanzanite solution provides a highly scalable architecture for exascale memory capacity and compute acceleration, supporting multiple industry-standard form factors, ranging from E1.S and E3.S to memory expansion boards and memory appliances.
Tanzanite’s scalable, low-latency, CXL-optimized fabric with caching, advanced security, and RAS features enables a multitude of use cases:
Amazon VPC, Amazon S3, Amazon EC2, EC2 Auto Scaling Groups, Amazon CloudWatch, AWS CloudFormation, Amazon EC2 Spot Instances, OpenPBS, AD Connector with Azure Active Directory, AWS ParallelCluster (Downgrading), Amazon OpenSearch Service, Terraform, Packer, Python Scripting, OpenVPN.
Tanzanite Silicon Solutions delivers SoC designs from concept to volume production. The company uses Amazon Web Services to run the simulations and batch jobs involved in preparing and creating its products. Tanzanite Silicon Solutions engaged DinoCloud to assess its product architecture and DevOps processes and to provide specialized services in this field.
In the first stage, the expected result was a production implementation of the solution based on Scale-Out Computing on AWS (SOCA), so that the client could migrate its legacy environment and run jobs on demand using Spot Instances. In this way, Tanzanite Silicon Solutions would be able to manage, scale, interrupt, and request simulations for Electronic Design Automation (EDA), using SOCA as the main orchestrator.
In turn, Tanzanite Silicon Solutions would monitor the results of each of its jobs using Amazon OpenSearch and manage the identities of its users with LDAP and Active Directory. In this way, the company would be able to move seamlessly between the simulation instances (Stone, Silver, Gold, Wind down) and scale its use of the platform with the same architecture.
First, we assessed the whole environment and the deployment of the platform, identifying risks, pain points, and areas for improvement. During the review, we drew on extensive knowledge of the AWS stack and of the infrastructure automation and CI/CD tools for the new architecture.
In parallel, we worked with the AWS team to build our knowledge of the semiconductor space. We needed to understand the industry's processes and business to provide the best possible solution. We received training in the Tanzanite SI business model and in the AWS offering for the semiconductor space, including AWS ParallelCluster and AWS SOCA.
In these first meetings, we understood that Tanzanite SI needed to move from AWS ParallelCluster (its current implementation) to AWS SOCA, and that this was the biggest epic of the engagement.
Once the main epics were identified, we set priority remediation objectives for the business, which were:
Once the project started, we worked with the Tanzanite SI team using the Kanban methodology, with close communication between the DinoCloud team of architects and project managers and the Tanzanite SI technical leaders.
Before the DinoCloud engagement, Tanzanite SI was using AWS ParallelCluster as its main job orchestrator. The company had a single environment, used for both production and experimentation, which had some issues when upgrading and updating the ecosystem.
During the first steps of this project, we coordinated a meeting with AWS and concluded that the best solution for the company was to migrate to the AWS SOCA stack, a more EDA-oriented architecture that was the key to solving many of the scaling and implementation issues the company was facing.
Before starting the migration process, we assessed the entire stack and the automation tools the company was using. We reviewed its Terraform/Packer implementation to improve the quality and robustness of the solution.
In the middle of the SOCA implementation, we needed to provide support for the job simulations the company was currently running. We worked on both fronts to keep the business going while we started deploying our first proof of concept with SOCA. Some of the improvements to the legacy environment (like the VPN deployment and the IaC upgrades) were important for the SOCA solution too.
Keeping the legacy environment was essential to keep the business moving forward. Tanzanite SI needed to continue making progress toward going public with its solution while it continued testing the new architecture.
Until the SOCA implementation is robust and stable, it is important for the company to keep both environments running.
A production-like environment ready to be tested.
We deployed the SOCA solution so that the company could start running simulation jobs on this architecture. We prepared a robust environment in which the company can grow the simulations at the core of its business.
We eased cloud infrastructure management through automation tools and out-of-the-box stacks to create, destroy, and upgrade the company's current environment.
We strengthened the existing AWS ParallelCluster environment so that the company's current simulations could keep running.
Other tasks performed: