New top story on Hacker News: Show HN: Openlayer – Test, fix, and improve your ML models
13 by vikasnair | 3 comments on Hacker News.
Hey HN, my name is Vikas, and my cofounders Rish, Gabe and I are building Openlayer: http://openlayer.com/

Openlayer is an ML testing, evaluation, and observability platform designed to help teams pinpoint and resolve issues in their models.

We were ML engineers who experienced the struggle that goes into properly evaluating models, making them robust to the myriad unexpected edge cases they encounter in production, and understanding the reasons behind their mistakes. It was like playing an endless game of whack-a-mole with Jupyter notebooks and CSV files: fix one issue and another pops up. This shouldn't be the case. Error analysis is vital to establishing guardrails for AI and ensuring fairness across model predictions.

Traditional software testing platforms are designed for deterministic systems, where a given input produces an expected output. Since ML models are probabilistic, testing them reliably has been a challenge. What sets Openlayer apart from other companies in the space is our end-to-end approach to tackling both pre- and post-deployment stages of the ML pipeline. This "shift-left" approach emphasizes the importance of thorough validation before you ship, rather than relying solely on monitoring after you deploy. Having a strong evaluation process pre-ship means fewer bugs for your users, shorter and more efficient dev cycles, and lower chances of getting into a PR disaster or having to recall a model.

Openlayer provides ML teams and individuals with a suite of powerful tools to understand models and data beyond your typical metrics. The platform offers insights about the quality of your training and validation sets, the performance of your model across subpopulations of your data, and much more. Each of these insights can be turned into a "goal." As you commit new versions of your models and data, you can see how your model progresses towards these goals, guard against regressions you might otherwise have missed, and continually raise the bar.

Here's a quick rundown of the Openlayer workflow:

1. Add a hook in your training / data ingestion pipeline to upload your data and model predictions to Openlayer via our API
2. Explore insights about your models and data and create goals around them [1]
3. Diagnose issues with the help of our platform, using powerful tools like explainability (e.g. SHAP values) to get actionable recommendations on how to improve
4. Track the progress over time towards your goals with our UI and API and create new ones to keep improving

We've got a free sandbox for you to try out the platform today! You can sign up here: https://ift.tt/l9Y2ypa . We are also soon adding support for even more ML tasks, so please reach out if your use case is not supported and we can add you to a waitlist.

Give Openlayer a spin and join us in revolutionizing ML development for greater efficiency and success. Let us know what you think, or if you have any questions about Openlayer or model evaluation in general.

[1] A quick rundown of the categories of goals you can track:

- Integrity goals measure the quality of your validation and training sets
- Consistency goals guard against drift between your datasets
- Performance goals evaluate your model's performance across subpopulations of the data
- Robustness goals stress-test your model using synthetic data to uncover edge cases
- Fairness goals help you understand biases in your model on sensitive populations
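
To make the idea of a performance goal over subpopulations a bit more concrete, here is a minimal sketch in plain Python. It is not Openlayer's API and makes no claim about how the platform computes anything: the function names (`accuracy_by_group`, `failing_groups`), the row layout with `label` and `prediction` keys, the `region` column, and the 0.7 threshold are all assumptions made up for illustration.

```python
from collections import defaultdict


def accuracy_by_group(rows, group_key):
    """Return {group value: accuracy} for rows carrying 'label' and 'prediction'."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for row in rows:
        group = row[group_key]
        total[group] += 1
        if row["label"] == row["prediction"]:
            correct[group] += 1
    return {group: correct[group] / total[group] for group in total}


def failing_groups(rows, group_key, min_accuracy):
    """Return, sorted, the groups whose accuracy is below the goal threshold."""
    scores = accuracy_by_group(rows, group_key)
    return sorted(g for g, acc in scores.items() if acc < min_accuracy)


# Hypothetical validation rows; 'region' is an invented subpopulation column.
rows = [
    {"region": "EU", "label": 1, "prediction": 1},
    {"region": "EU", "label": 0, "prediction": 0},
    {"region": "EU", "label": 1, "prediction": 1},
    {"region": "EU", "label": 0, "prediction": 1},
    {"region": "US", "label": 1, "prediction": 0},
    {"region": "US", "label": 0, "prediction": 1},
    {"region": "US", "label": 1, "prediction": 1},
    {"region": "US", "label": 0, "prediction": 0},
]

scores = accuracy_by_group(rows, "region")

# A hypothetical goal: every region must reach at least 0.7 accuracy.
below_goal = failing_groups(rows, "region", min_accuracy=0.7)
```

Running a check like this on every new model or dataset version is one simple way to notice a regression in a single slice that an aggregate metric would hide.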