Subscribe to the Newsletter

Join 10k+ people to get notified about new posts, news and tips.

Do not worry we don't spam!

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Log in

Have no account yet? Sign up

Create an account

Already have an account? Log in

Reset password

Remember your password? Log in

Terms of use

SingularityByte.com values the privacy of our users. Therefore, this privacy policy explains in detail how we use and protect the information we collect when you visit our website.. Read this privacy policy completely. Please refrain from visiting the site if the terms outlined below are not satisfactory to you. We reserve the right to change this policy at any time and will list these changes in the updates section of the policy. By reading this notice and visiting the site, you agree that you understand that customers will not be personally notified when this policy changes. Therefore, we advise our customers to frequently review our privacy policy so that they remain aware of its updates. By using the site, you accept that the posted policy and all its changes apply to your interaction with SingularityByte.com.

Information Collected by SingularityByte.com

Personal information may be collected by this site in many ways. This information includes:

Personal identifying information like your name, address, email, phone number, age, gender, and other personal data
Server data related to the IP address you used to visit our website, which includes your address, browser, OS, access time, and site activity.
Financial information related to your orders including your payment method and identifying payment information. We rarely store financial information collected on our site for transaction purposes. That information gets sent directly to our payment processor.
Social network data including Facebook permissions and user information from other networks, provided you log onto our site using one of these media sites.
Mobile device information such as your device ID, model, and location, if you use our site by accessing trough our website.

How We Use This Information

Our website uses information collected to:
• Manage your account information
• Customize ads
• Deliver promotions
• Email your account confirmation
• Manage purchases and payments
• Increase site efficiency
• Notify you of updates
• Offer new products
• Monitor and prevent theft
• Request your customer feedback
• Resolve account disputes
• Respond to your service requests

Information Disclosure

Normally, your information stays on our site. However, below we have listed the situations that may
require us to share the information we collect from you:
• When required by law, such as for fraud protection
• With our third-party providers for payment processing and hosting
• With your consent for marketing purposes
• When you post comments on the site
• To our advertisers, affiliates, and partners
• If this site goes bankrupt and data must be transferred

Cookies, Trackers, and Online Ads

We may use cookies, trackers, web beacons, and other technology to customize our website to improve your experience. We may customize the site using this information. These trackers do not have access to your personal information and can be removed from your browser options. In addition, third-party software provides ads for our site for marketing campaigns. These programs have access to tracking technology to optimize your ad experience. For more information about these
ads, visit [link to the privacy policies of affiliate advertisers]. Website analytics such as through Google Analytics may also be used to track users
and remarket our website. We do not give these vendors access to your personal information.

Other Sites

Our website may contain links to third-party websites in the form of policies, ads, and other non-affiliated links. Once you leave our site, we are no longer responsible for how your information is collected and disclosed. Please refer to the privacy policies of those third-party sites for more information.

Information Security

We take technical and administrative precautions to protect your data, but we cannot guarantee its safety against all types of fraud or misuse. If you provide personal information, we cannot verify its total security against all types of interception.

Do-Not-Track

Some browsers offer Do-Not-Track settings to prevent any information from being distributed. Since these settings have not been legally established as standard practice, we do acknowledge these settings.

Additional Options

At any time, you may opt to review or change your account settings, including contact information. If you wish to delete your account, you may do so to remove most of your information, however, some identifying information will be retained to prevent fraud.
You may also opt-out of emails and other correspondences from our site at any time.

Microsoft Clarity

We partner with Microsoft Clarity and Microsoft Advertising to capture how you use and interact with our website through behavioral metrics, heatmaps, and session replay to improve and market our products/services. Website usage data is captured using first and third-party cookies and other tracking technologies to determine the popularity of products/services and online activity. Additionally, we use this information for site optimization, fraud/security purposes, and advertising. For more information about how Microsoft collects and uses your data, visit the Microsoft Privacy Statement.

Contact Us

If you have questions or concerns about this privacy policy, please feel free to contact us at: desk@SingularityByte.com

Do you agree to our terms? Sign up

Convergence AI - Vision-Language

Proxy Lite-3B

Explore Proxy Lite-3B, a 3B-parameter open-source Vision-Language Model for efficient web automation. Achieve a 72.4% success rate with ease!
2025-02-25
Updated 2025-03-13 09:23:36

TABLE OF CONTENTS

Key Points

- Proxy Lite-3B is a 3-billion-parameter open-source Vision-Language Model (VLM) for web automation, released by Convergence AI on February 25, 2025, and available on Hugging Face.

- It seems likely that it performs well in UI navigation, achieving a 72.4% success rate on the WebVoyager benchmark, leading among open-weights models.

- Research suggests it’s efficient, using fewer resources, and is finetuned from Qwen/Qwen2.5-VL-3B-Instruct, making it accessible for developers.

Comprehensive Analysis of Proxy Lite-3B: A Deep Dive into Its Capabilities and Implications

What is Proxy Lite-3B?

Proxy Lite-3B is a compact AI model designed for web automation tasks, such as filling forms or clicking buttons on websites. It’s open-source, meaning anyone can use, modify, or distribute it, which is great for developers looking to build AI applications without heavy computational needs.

Why It Matters

This model stands out because it’s small yet powerful, with a 72.4% success rate on the WebVoyager benchmark, which tests real-world web interactions. It’s also efficient, using fewer resources than larger models, making it practical for many users. An unexpected detail is its framework for VLM-browser interaction, using tools like Playwright for precise web navigation, which could open new ways to automate daily tasks.

How to Use It

You can host Proxy Lite-3B locally using vLLM, with instructions available on its GitHub page (proxy-lite). It’s recommended for production use to avoid the slower demo endpoint on Hugging Face (proxy-lite-3b).

Introduction

On February 25, 2025, Convergence AI released Proxy Lite-3B, a 3-billion-parameter Vision-Language Model (VLM) tailored for web automation tasks, marking a significant milestone in open-source AI development. Hosted on Hugging Face (proxy-lite-3b), this model is designed to navigate and interact with web interfaces, offering a compact yet efficient solution for developers and AI enthusiasts. This article explores Proxy Lite-3B’s architecture, performance, applications, and potential, providing a thorough examination for those interested in its technical and practical implications.

Background and Release

Proxy Lite-3B was announced by Convergence AI on their website (proxy_lite), emphasizing its role as a mini, open-weights version of their Proxy assistant. The release aligns with the growing trend of democratizing AI through open-source models, enabling broader access to advanced web automation capabilities. Its availability on Hugging Face (proxy-lite-3b) ensures easy access for the global developer community, with a GitHub repository (proxy-lite) providing additional resources and setup instructions.

Model Architecture and Technical Details

Proxy Lite-3B is finetuned from Qwen/Qwen2.5-VL-3B-Instruct, a 3B parameter VLM known for processing both visual and textual inputs. This base model allows Proxy Lite-3B to interpret web page visuals and text, enabling tasks like clicking buttons or filling forms. With a model size of 3.75B parameters and using BF16 tensor type, it’s designed for efficiency, requiring less computational power compared to larger models. The model’s architecture includes a framework for VLM-browser interaction, leveraging the `Runner` class and `BrowserTool` class for precise web navigation, as detailed in its GitHub repository (proxy-lite). This framework uses Playwright for browser control, with actions defined by `mark_id`s for interacting with web elements.

Performance on WebVoyager Benchmark

The WebVoyager benchmark, introduced by He et al. in their paper (WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models), evaluates AI agents on real-world web tasks across 15 popular websites. Proxy Lite-3B achieved a 72.4% success rate, as noted on its Hugging Face page (proxy-lite-3b), positioning it as the leader among open-weights models. Detailed performance metrics, provided by Convergence AI (proxy_lite), show varying success rates across websites, as seen in the table below:

Website Name	Success Rate (%)	Finish Rate (%)	Avg. Messages
Allrecipes	87.8	95.1	10.3
Amazon	70.0	95.0	7.1
Apple	82.1	89.7	10.7
ArXiv	60.5	79.1	16.0
BBC News	69.4	77.8	15.9
Booking	70.0	85.0	24.8
Cambridge Dictionary	86.0	97.7	5.7
Coursera	82.5	97.5	4.7
ESPN	53.8	97.5	14.9
GitHub	85.0	92.5	10.0
Google Flights	38.5	51.3	34.8
Google Map	78.9	94.7	9.6
Google Search	71.4	92.9	6.0
Huggingface	68.6	74.3	18.4
Wolfram Alpha	78.3	93.5	6.1

These results, with full trajectories available at (eval trajectories), underscore its capability to handle diverse web tasks, though performance varies by website complexity.

Efficiency and Resource Use

One of Proxy Lite-3B’s standout features is its efficiency. With only 3B parameters, it uses a fraction of the computational resources required by larger models, making it suitable for deployment on devices with limited hardware. The GitHub repository (proxy-lite) recommends hosting it locally using vLLM, with a command like `vllm serve --model convergence-ai/proxy-lite-3b --trust-remote-code --enable-auto-tool-choice --tool-call-parser hermes --port 8008`, ensuring optimal performance. This efficiency is particularly beneficial for developers looking to integrate AI into applications without significant infrastructure costs.

Applications and Use Cases

Proxy Lite-3B’s primary application is web automation, enabling tasks such as:

- Automating repetitive web tasks, like form filling or scheduling appointments.

- Web scraping and data collection, offering a more intelligent approach compared to traditional methods.

- Enhancing AI assistants to interact directly with web content, expanding their functionality beyond text-based interactions.

Its open-source nature allows developers to fine-tune it for specific needs, potentially extending its use to custom web automation workflows. For example, it can assist in automating customer support tasks on e-commerce websites or streamline data entry processes.

Comparison with Other Models

Compared to other open-source models, Proxy Lite-3B’s 72.4% success rate on WebVoyager positions it as a leader among its peers, especially given its small size. Proprietary models, like those from Anthropic, may achieve higher rates, but they often require significant resources. Browser Use, another open-source agent mentioned in recent discussions (Browser Use AI model), claims a 89.1% success rate, but it’s a broader agent framework, not directly comparable to Proxy Lite-3B’s VLM focus. This comparison highlights Proxy Lite-3B’s niche strength in efficient, open-source web automation.

Getting Started and Community Engagement

To get started with Proxy Lite-3B, users can clone the GitHub repository (proxy-lite) and follow the setup instructions. The repository includes a demo API endpoint (demo-api) for initial testing, though it’s noted as unsuitable for production due to potential slowness under load. For production use, hosting locally with vLLM is recommended, with detailed commands provided. The community can engage through Convergence AI’s resources, though specific forums or Discord channels weren’t detailed in the release, suggesting users check the GitHub for contribution guidelines.

Limitations and Challenges

While Proxy Lite-3B offers significant advantages, it faces challenges such as anti-bot measures on websites, mitigated by using `playwright_stealth` and network proxies, especially in headless mode. Some tasks, like those on Google Flights, show lower success rates (38.5%), indicating areas for improvement in handling complex dynamic content.

Conclusion

Proxy Lite-3B represents a pivotal advancement in open-source AI for web automation, offering a balance of performance, efficiency, and accessibility. Its 72.4% success rate on the WebVoyager benchmark, coupled with its compact size, makes it a valuable tool for developers and researchers. As the AI community continues to build upon this model, we can anticipate further innovations in web interaction and automation, potentially transforming how we engage with digital interfaces.

Key Citations

- proxy-lite-3b Convergence AI Hugging Face model page

- Announcing Proxy Lite open-weights version Convergence AI website

- proxy-lite GitHub repository for setup and details

- WebVoyager Building an End-to-End Web Agent with Large Multimodal Models research paper

- Browser Use open-source AI agent for web automation InfoWorld article

- eval trajectories for Proxy Lite performance on WebVoyager

- demo-api Hugging Face spaces for Proxy Lite testing