- News & Resources: Listings >
- Blog
- How to Improve Safety and Security in Schools – Cloud Manage Network
- Top 10 Cybersecurity Threats in 2024
- Microsegmentation: Protecting Data from Cyber Threats
- Retail shoplifting and loss prevention: How to protect your business
- Generative AI Cost Optimization Strategies
- Why Do I Need to Protect My Cloud?
- 10 Reasons for Engaging Outside Experts to Manage Your Cybersecurity
- Why Hiring a 3rd Party MSP Expert Makes Sense and – and Cents (MANY cents!)
- Brand and Network Considerations When Adopting AI Corporately
- Integrating XDR, SIEM, and SOAR
- 3-2-1 –Go? Not so quick, this time.
- 5 Things a CISO Shoud Know
- 10-Step Patch Management Checklist
- Penetration Testing vs. Breach Attack Simulation
- Current big cyber breaches and impact on businesses
- Smart Infrastructure Gets Lit Up!
- Securing Industrial IoT: The Missing Puzzle Piece
- 7 Common Cybersecurity Mistakes Made by SMBs
- The Future of Physical Security: Cloud-Based Systems
- Autonomous and Sensor Technology Use Surging
- 2024 Facilities Trends Will Require Facilities and IT Teams to Work in Tandem
- NGFW vs. WAF. What’s the Right Firewall for You?
- Chris Hadfield’s Words To Live By
- Industrial Revolution 4.0 + IIoT
- Digital Fluency Drives Innovation
- Your Cloud Needs Protecting, Too
- Your building alarm systems could become obsolete. In 2024!
- Zero Trust 2.0: Zero Trust Data Resilience (ZTDR)
- We just got, or got used to, Wi-Fi 6. What is Wi-Fi 7?
- What Does the Board Need to Know? Business Metrics that CISOs Should Share – 4th and Last in a Four-Part Series
- Why 2024 is the Year for AI Networking
- International Women’s Day is Tomorrow – Great Time to Think About…
- Data-Centric Security Step One: Classifying Your Data
- The Network – Unsung Hero of Super Bowl LVIII
- What Does the Board Need to Know? Business Metrics that CISOs Should Share – Third in a Four-Part Series
- Boosting IT Team Performance by Fostering Intuition, Curiosity and Creativity
- Breach Remediation Costs Can Wipeout Bottom Line and Business
- Hoodied Hackers Now Favour Hugo Boss
- What Do You Need to Tell the Board? Business Metrics that CISOs Should Share – Second in a Four-Part Series
- How to Get People to Re-Engage After the Holidays
- What Does the Board Need to Know? Business Metrics that CISOs Should Share – First in a Four-Part Series
- Android Devices MUST be Updated + IT Departments Being Cut as Privilege Escalation Escalates
- Today’s Common Cloud Migration and Management Concerns
- Protect Your Healthcare Network from Cyberattack – Lives are at Stake
- Happy Halloween: Black Cats Lead to Boo….Hoo.
- Insurance Underwriters are Protecting Their Flanks
- Insurance Companies Cracking Down as Cybercriminals Become Better Business Builders
- Scary Cyberattacks Stats
- Parents, Profs and IT Professionals Perceive Back-to-School Through Different Lens
- Zscaler’s new IDTR and other tools that leverage generative AI
- Vanquish Vaping, Vandalism and Villainy
- Fabric for Fast-Paced Environments
- Changes to Cyber Insurance Requirements – What you Need to Know
- Cybersecurity Readiness – Newly Released Report
- Passwords Leaked…Again
- 10-Step Patch Management Checklist
- Remote – Again – For Now… and Still Maintaining Engagement
- Protecting Pocketbooks, Passwords and Property from Pilfering
- Raspberry Robin: Highly Evasive Worm Spreads over External Disks
- Cisco Introduces Responsible AI – Enhancing Technology, Transparency and Customer Trust
- Managing Customer Trust in Uncertain Supply Chain Conditions
- Hope on the Horizon
- Toys of Tomorrow… What will spark your imagination? Fuel your imagination?
- Protecting Purses and Digital Wallets
- The Password that Felled the Kingdom + MFA vs 2FA
- The MOE’s RA 3.0 and Zscaler
- 7 Critical Reasons for MS Office 365 Backup
- Penetration Testing Important, but…
- Social Engineering and Poor Patching Responsible for Over 90% of Cybersecurity Problems
- Breach Incidence and Costs On the Rise Again + 5 Ways to Reduce Your Risk
- Cybersecurity Insurance Policies Require Security Audits and Pen Testing
- Wireless strategies for business continuity gain importance as enterprise expand IoT, cloud, and other technologies
- How Cybercrooks are Targeting YOU
- Enabling Digital Transformation with Cisco SD-WAN
- WFH Post Pandemic – What It Will Look Like. What You’ll Need.
- Leaders to looking to the IoT to improve efficiency and resiliency
- Cyber Security Vernacular – Well, some of it, for now
- Why You Need Disaster Recovery, NOT Just Back-Ups
- 10 Reasons Why Having an Expert Manage Your Cybersecurity Makes Sense and Saves Dollars
- Converting CapEx IT Investments into Manageable OpEx
- The Hybrid Workplace – Planning the Next Phase
- Cisco Cloud Calling: Empowering Customers to Thrive with Hybrid Work
- When You Can’t Access the Cloud
- How to Keep On Keeping On
- New Cisco Research Reveals Collaboration, Cloud and Security are IT’s Top Challenges
- Threats from Within on the Rise
- Cloud Covered? If Not, Take Cover!
- Zero Trust and Forrester Wave Report
- Password Based Cyber Attack: Like Leaving Keys Under Doormats
- So, What’s Up With Sensors?
- Sensors and Systems Create a Digital “Last Mile” and Help Skyrocketing Costs
- Scanners Provide Peace of Mind for Returning Students and Workers
- Sensors Improve Operations and Bottom Line… Easily and Cost-Affordably.
- Cisco Meraki Looks at 2021
- 2020 Holiday Shopping: Cybersecurity and Other Tips to Safeguard Wallets and Systems
- How to make the most of the technology you have
- Personnel, Planet and Business Progress: More Interdependent Than Ever Before
- Sure… you can get them all in the boat – but can you get them to work well together?
- Pushing the Zero Trust Envelope – Cisco is Named a Leader in the 2020 Forrester Zero Trust Wave
- Cloud Data Must be Protected, Too!
- Don’t Let Anyone Get the Dirt on You – Make It Instead!
- How IoT Devices Can Help You and Your business
- WebEx – A World of Possibility
- Creating Your Breach Response Plan Now Will Save You Thousands Down The Road
- Been hacked? Here’s what you must do next.
- The Need for Pen Testing is At an All-Time High
- 5 Ways an IT Reseller Improves Your Performance and Peace-of-Mind
- 5G and Wi-Fi 6: Faster, more flexible, and future ready. Are you?
- Network and Data Security for Returning and Remote Workers + Disaster Recovery Symposium
- Collaboration and Cisco WebEx: Protecting Your Data
- Thursday’s Virtual Conference Tackles Today’s Supply Chain Trials and Tribulations
- 10 Tips to Reduce Cloud Storage Risk
- COVID-19 Crisis Fuelling IT Spending
- Supply Chain/Logistics Experts Share Their Expertise
- Cisco Breach Defence Overview
- Announcing Our New Website and Blog
One of today’s hot buttons is AI. It doesn’t matter if you are a student, retiree, starting your career or holding a C-suite title, how to use AI (or how it is being used in daily transactions) is on everyone’s mind. For companies dipping their toes in the AI well, the question of Cost Optimization is a part of the discussion, too.
On September 23, 2024, Matthias Patzak with AWS, one of our partners, wrote an interesting blog post on the topic, which we would like to share with you, below.
“Matthias joined the Enterprise Strategist team in early 2023 after a stint as a Principal Advisor in AWS Solutions Architecture. In this role, Matthias works with executive teams on how the cloud can help to increase the speed of innovation, the efficiency of their IT, and the business value their technology generates from a people, process and technology perspective. Before joining AWS, Matthias was Vice President IT at AutoScout24 and Managing Director at Home Shopping Europe. In both companies he introduced lean-agile operational models at scale and led successful cloud transformations resulting in shorter delivery times, increased business value and higher company valuations.”
His Generative AI Cost Optimization Strategies post:
“As an executive exploring generative AI’s potential for your organization, you’re likely concerned about costs. Implementing AI isn’t just about picking a model and letting it run. It’s a complex ecosystem of decisions, each affecting the final price tag. This post will guide you through optimizing costs throughout the AI life cycle, from model selection and fine-tuning to data management and operations.
Model Selection
Wouldn’t it be great to have a lightning-fast, highly accurate AI model that costs pennies to run? Since this ideal scenario does not exist (yet), you must find the optimal model for each use case by balancing performance, accuracy, and cost.
Start by clearly defining your use case and its requirements. These questions will guide your model selection:
- Who is the user?
- What is the task?
- What level of accuracy do you need?
- How critical is rapid response time to the user?
- What input types will your model need to handle, and what output types are expected?
Next, experiment with different model sizes and types. Smaller, more specialized models may lack the broad knowledge base of their larger counterparts, but they can be highly effective—and more economical—for specific tasks.
Consider a multi-model approach for complex use cases. Not all tasks in a use case may require the same level of model complexity. Use different models for different steps to improve performance while reducing costs.
Fine-Tuning and Model Customization
Pretrained foundation models (FMs) are publicly available and can be used by any company, including your competitors. While powerful, they lack the specific knowledge and context of your business.
To gain a competitive advantage, you need to infuse these generic models with your organization’s unique knowledge and data. Doing so transforms an FM into a powerful, customized tool that understands your industry, speaks your company’s language, and leverages your proprietary information. Your choice to use retrieval-augmented generation (RAG), fine-tuning, or prompt engineering for this customization will affect your costs.
Retrieval-Augmented Generation
RAG pulls data from your organization’s data sources to enrich user prompts so they deliver more relevant and accurate responses. Imagine your AI being able to instantly reference your product catalog or company policies as it generates responses. RAG improves accuracy and relevance without extensive model retraining, balancing performance and cost efficiency.
Fine-Tuning
Fine-tuning means training an FM on a additional, specialized data from your organization. It requires significant computational resources, machine learning expertise, and carefully prepared data, making it more expensive to implement and maintain than RAG.
Fine-tuning excels when you need the model to perform especially well on specific tasks, consistently produce outputs in a particular format, or perform complex operations beyond simple information retrieval.
I recommend a phased approach. Start with less resource-intensive methods such as RAG and consider fine-tuning only when these methods fail to meet your needs. Set clear performance benchmarks and regularly evaluate the gains versus the resources invested.
Prompt Engineering
Prompts are the instructions given to AI applications. AI users such as designers, marketers, or software developers enter prompts to generate the output they want, such as pictures, text summaries or source code. Prompt engineering is the practice of crafting and refining these instructions to get the best possible results. Think of it as asking the right questions to get the best answers.
Good prompts can significantly reduce costs. Clear, specific instructions reduce the need for multiple back-and-forth interactions that can quickly add up in pay-per-query pricing models. They also lead to more accurate responses, reducing the need for costly, time-consuming human review. With prompts that provide more context and guidance, you can often use smaller, more cost-effective AI models.
Data Management
The data you use to customize generic FMs is also a significant cost driver. Many organizations fall into the trap of thinking that more data always leads to better AI performance. In reality, a smaller dataset of high-quality, relevant data often outperforms larger, noisier datasets.
Investing in robust data cleansing and curation processes can reduce the complexity—and cost—of customizing and maintaining AI models. Clean, well-organized data allows for more efficient fine-tuning and produces more accurate results from techniques like RAG. It lets you streamline the customization process, improve model performance, and ultimately lower the ongoing costs of your AI implementations.
Strong data governance practices can help increase the accuracy and cost-performance of your customized FM. It should include proper data organization, versioning, and lineage tracking. On the other hand, inconsistently labeled, outdated, or duplicate data can cause your AI to produce inaccurate or inconsistent results, slowing performance and increasing operational costs. Good governance helps ensure regulatory compliance, preventing costly legal issues down the road.
Operations
Controlling AI costs isn’t just about technology and data—it’s about how your organization operates.
Organizational Culture and Practices
Foster a culture of cost-consciousness and frugality around AI, and train your employees in cost-optimization techniques. Share case studies of successful cost-saving initiatives and reward innovative ideas that lead to significant cost savings. Most importantly, encourage a prove-the-value approach for AI initiatives. Regularly communicate the financial impact of AI to stakeholders.
Continuous learning about AI developments helps your team identify new cost-saving opportunities. Encourage your team to test various AI models or data preprocessing techniques to find the most cost-effective solutions.
FinOps for AI
FinOps, short for financial operations, is a practice that brings financial accountability to the variable spend model of cloud computing. For AI initiatives, it can help your organization more efficiently use and manage resources for training, customizing, fine-tuning, and running your AI models. (Resources include cloud computing power, data storage, API calls, and specialized hardware like GPUs.) FinOps helps you forecast costs more accurately, make data-driven decisions about AI spending, and optimize resource usage across the AI life cycle.
FinOps balances a centralized organizational and technical platform that applies the core FinOps principles of visibility, optimization, and governance with responsible and capble decentralized teams. Each team should “own” its AI costs—making informed decisions about model selection, continuously optimizing AI processes for cost efficiency, and justifying AI spending based on business value.
A centralized AI platform team supports these decentralized efforts with a set of FinOps tools and practices that includes dashboards for real-time cost tracking and allocation, enabling teams to closely monitor their AI spending. Anomaly detection allows you to quickly identify and address unexpected cost spikes. Benchmarking tools facilitate efficiency comparisons across teams and use cases, encouraging healthy competition and knowledge sharing.
Conclusion
As more use cases emerge and AI becomes ubiquitous across business functions, organizations will be challenged to scale their AI initiatives cost-effectively. They can lay the groundwork for long-term success by establishing robust cost optimization techniques that allow them to innovate freely while ensuring sustainable growth. Success depends on perfecting the delicate balance between experimentation, performance, accuracy, and cost.”
If you’d like to learn more about using AWS and AI, please feel free to contact us: 416.429.0796 or 1.877.238.9944 (Toll Free) or email us: [email protected].