
{build} Hackathon & Incubator



© 2026 Government Technology Agency of Singapore | GovTech

EvalAI

Booth PR11

Your resourcing buddy that revolutionises the way proposals are evaluated: it summarises key information and flags gaps early, enabling well-informed, strategic and consistent evaluations of multiple proposals at your fingertips.

Summary

EvalAI was developed to address the challenges officers face when evaluating 7-12 proposals within a 1.5-week timeframe. Officers are required to read through each proposal, seek clarifications, and assess them thoroughly. However, proposals from agencies often vary significantly in format and content, making it difficult for officers to context-switch between them, especially given the tight deadlines. Additionally, different types of proposals require distinct evaluation criteria, adding to the complexity. Inconsistencies in evaluations can arise due to varying standards and experience levels among officers, potentially leading to oversights of critical information. These issues can result in inconsistent and incomplete evaluations, which may contribute to less informed decision-making.

Research and Approach

  1. We reached out to several ministries/agencies and found that we were not the only ones facing this problem and that no existing solutions were available; officers were manually reading and evaluating proposals based on their individual knowledge and experience.
  2. With strong support from our business user, we interviewed officers and observed how evaluations were processed, identifying patterns and pain points that helped us define and scope the problem.
  3. To ensure comprehensive user research and gain a deeper appreciation of the challenges and frustrations at each stage of the process, we collaborated with our business user to map out an officer's proposal evaluation journey.
  4. After synthesising our findings, we identified and tackled the most pressing challenge in the process: delivering well-informed evaluations of multiple proposals within a short timeframe.
  5. After developing the prototype, we tested it with 4 Resourcing Evaluators, with the following results:
    • Output was generally in line with their expectations
    • The rationale provided in the additional assessment was crucial, as evaluators needed to justify their support levels to their management
    • Satisfaction rating: 4/6, with feedback on improving tonality and accuracy
    • Likelihood of recommending EvalAI to other colleagues: 7/10

Solution Overview

We developed a GenAI solution with the following features:

  1. [Standardized Assessment Table] Consolidates key information into a standardized template that lets evaluators focus on the key aspects of each proposal.
  2. [Additional Assessment] Provides additional structured guidance to help evaluators refine their support level.
  3. [Chatbot using Agentic AI] An interactive chatbot that enables evaluators to:
    • Query proposal details and generate outputs
    • Update specific sections of the output directly
    • Identify weak justifications and receive suggested improvements based on evaluation guidelines
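As an illustrative sketch (not the actual EvalAI code), the standardized assessment table can be modelled as a fixed schema that the GenAI pipeline fills in per proposal, with empty required fields flagged as gaps before evaluation. All field names below are hypothetical:

```python
from dataclasses import dataclass, field

@dataclass
class AssessmentRow:
    """One row of a standardized assessment table (field names are illustrative)."""
    proposal_id: str
    objective: str
    requested_headcount: int
    justification: str = ""
    gaps: list = field(default_factory=list)

def flag_gaps(row, required=("objective", "justification")):
    """Flag empty required fields so evaluators see missing information early."""
    for name in required:
        if not getattr(row, name):
            row.gaps.append(f"missing: {name}")
    return row

# Example: a proposal with no stated objective is flagged before evaluation.
row = flag_gaps(AssessmentRow("P-01", "", 3, justification="Backfill for project X"))
print(row.gaps)  # ['missing: objective']
```

In the real system the structured extraction would be done by the LLM (e.g. via PydanticAI's typed outputs), but the gap-flagging step is plain validation over the resulting schema.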

Tech Stack

Category                 Product
Coding Stack             Python, Streamlit
GenAI Framework          LangChain, PydanticAI
LLM                      GovText (LLM-as-a-Service)
GenAI Coding Assistant   Codeium
Hosting                  ContainerStack
Guardrail                AI Guardian
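A guardrail layer such as AI Guardian typically sits between the application and the LLM, validating every model response before it reaches the evaluator. A minimal stdlib sketch of that wrapper pattern, where `generate` and `check` are stand-ins for the GovText call and the guardrail check (neither real API is used here):

```python
def with_guardrail(generate, check):
    """Wrap an LLM call so every output must pass a guardrail check (pattern sketch).

    `generate` stands in for the LLM call (e.g. GovText) and `check` for the
    guardrail (e.g. AI Guardian); both are hypothetical callables here.
    """
    def guarded(prompt):
        output = generate(prompt)
        if not check(output):
            raise ValueError("guardrail rejected model output")
        return output
    return guarded

# Toy usage: block any output containing an (assumed) sensitive marker.
fake_llm = lambda prompt: f"Assessment: {prompt}"
no_leaks = lambda text: "NRIC" not in text
safe_llm = with_guardrail(fake_llm, no_leaks)
print(safe_llm("summarise proposal P-01"))  # Assessment: summarise proposal P-01
```

Keeping the guardrail as a wrapper means the chatbot, summarisation, and assessment paths all share the same safety check without duplicating it.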

Outcome and Impact

Current State: 7-12 proposals per officer to evaluate within 1.5 working weeks; cumulative 28 working days/officer (~$75K/resourcing cycle).
To-Be State: EvalAI summarizes key information and flags gaps; cumulative 14 working days/officer (~$37.5K/resourcing cycle, a 50% reduction).
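The to-be figures follow from halving the evaluation time, assuming cost scales linearly with cumulative officer working days (an assumption; the cost model is not stated here):

```python
# Back-of-envelope check of the figures above, assuming cost scales
# linearly with cumulative officer working days per resourcing cycle.
days_before, days_after = 28, 14
cost_before = 75_000  # ~$75K per resourcing cycle

cost_after = cost_before * days_after / days_before
reduction = 1 - days_after / days_before
print(f"~${cost_after:,.0f} per cycle, {reduction:.0%} reduction")
```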

Looking Ahead

  1. Expand the test scope to include more user personas and discover potential use cases for EvalAI.
  2. Develop a proof-of-concept (POC) to handle different agencies' evaluation criteria.
  3. Explore scaling opportunities with MOF.
  4. Market EvalAI to other ministries/agencies to help with their proposal evaluation processes.
  5. Find opportunities to open up access for agencies to perform a first-cut review before submitting proposals to the ministry, starting with user research.

The Team

Left to right: Shawn Wang, Teo Peng Bin, James Chiang, Jason Han