Thursday, January 15, 2026

Scaling medical content review at Flo Health using Amazon Bedrock (Part 1)


This blog post is based on work co-developed with Flo Health.

Healthcare science is rapidly advancing. Maintaining accurate and up-to-date medical content directly impacts people's lives, health decisions, and well-being. When someone searches for health information, they are often at their most vulnerable, making accuracy not just important, but potentially life-saving.

Flo Health creates thousands of medical articles annually, providing millions of users worldwide with medically credible information on women's health. Verifying the accuracy and relevance of this vast content library is a significant challenge. Medical knowledge evolves continuously, and manual review of each article is not only time-consuming but also prone to human error. This is why the team at Flo Health, the company behind the leading women's health app Flo, is using generative AI to facilitate medical content accuracy at scale. Through a partnership with the AWS Generative AI Innovation Center, Flo Health is developing an innovative approach, further referred to as the "Medical Automated Content Review and Revision Optimization Solution" (MACROS), to verify and maintain the accuracy of its extensive health information library. This AI-powered solution is capable of:

  1. Efficiently processing large volumes of medical content.
  2. Identifying potential inaccuracies or outdated information based on credible scientific sources.
  3. Proposing updates based on the latest medical research and guidelines, as well as incorporating user feedback.

The system, powered by Amazon Bedrock, allows Flo Health to conduct medical content reviews and revision assessments at scale, ensuring up-to-date accuracy and supporting more informed healthcare decision-making. The system performs detailed content analysis, providing comprehensive insights on adherence to medical standards and guidelines for Flo's medical experts to review. It is also designed for seamless integration with Flo's existing tech infrastructure, facilitating automated updates where appropriate.

This two-part series explores Flo Health's journey with generative AI for medical content verification. Part 1 examines our proof of concept (PoC), including the initial solution, capabilities, and early results. Part 2 focuses on scaling challenges and real-world implementation. Each article stands alone while collectively showing how AI transforms medical content management at scale.

Proof of concept goals and success criteria

Before diving into the technical solution, we established clear objectives for our PoC medical content review system:

Key objectives:

  • Validate the feasibility of using generative AI for medical content verification
  • Determine accuracy levels compared to manual review
  • Assess processing time and cost improvements

Success metrics:

  • Accuracy: Content piece recall of 90%
  • Efficiency: Reduce detection time from hours to minutes per guideline
  • Cost reduction: Reduce expert review workload
  • Quality: Maintain Flo's editorial standards and medical accuracy
  • Speed: 10x faster than the manual review process

To verify that the solution meets Flo Health's high standards for medical content, Flo Health's medical experts and content teams worked closely with AWS technical specialists through regular review sessions, providing critical feedback and medical expertise to continuously enhance the AI model's performance and accuracy. The result is MACROS, our custom-built solution for AI-assisted medical content verification.

Solution overview

In this section, we outline how the MACROS solution uses Amazon Bedrock and other AWS services to automate medical content review and revisions.

Figure 1. Medical Automated Content Review and Revision Optimization Solution overview

As shown in Figure 1, the developed solution supports two main processes:

  1. Content Review and Revision: Assesses the medical standards and style adherence of existing medical articles at scale, given the pre-specified custom rules and guidelines, and proposes a revision that conforms to the new medical standards as well as Flo's style and tone guidelines.
  2. Rule Optimization: MACROS accelerates the process of extracting new (medical) guidelines from (medical) research, pre-processing them into the format needed for content review, and optimizing their quality.

Both steps can be performed through the user interface (UI) as well as a direct API call. The UI support allows medical experts to immediately see the content review statistics, interact with changes, and make manual adjustments. The API call support is intended for integration into a pipeline for periodic evaluation.

Architecture

Figure 2 depicts the architecture of MACROS. It consists of two major parts: backend and frontend.

Figure 2. MACROS architecture

In the following, the flow of the main app components is presented:

1. Users begin by gathering and preparing content that needs to meet medical standards and rules.

2. In the second step, the data is provided as PDF or TXT files, or as text, through the Streamlit UI that is hosted in Amazon Elastic Container Service (ECS). The authentication for file upload happens through Amazon API Gateway.

3. Alternatively, custom Flo Health JSON files can be uploaded directly to the Amazon Simple Storage Service (S3) bucket of the solution stack.

4. The ECS-hosted frontend has AWS IAM permissions to orchestrate tasks using AWS Step Functions.

5. Further, the ECS container has access to S3 for listing, downloading, and uploading files, either through pre-signed URLs or boto3.

6. Optionally, if the input file is uploaded through the UI, the solution invokes the AWS Step Functions service, which starts the pre-processing functionality hosted by an AWS Lambda function. This Lambda has access to Amazon Textract for extracting text from PDF files. The files are stored in S3 and also returned to the UI.

7-9. Hosted on AWS Lambda, the Rule Optimizer, Content Review, and Revision functions are orchestrated through AWS Step Functions. They have access to Amazon Bedrock for generative AI capabilities to perform rule extraction from unstructured data, content review, and revision, respectively. Additionally, they have access to S3 through the boto3 SDK to store the results.

10. The Compute Stats AWS Lambda function has access to S3 and can read and combine the results of individual revision and review runs.

11. The solution uses Amazon CloudWatch for system monitoring and log management. For production deployments dealing with critical medical content, the monitoring capabilities could be extended with custom metrics and alarms to provide more granular insights into system performance and content processing patterns.
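For readers integrating a similar pipeline, the frontend-to-Step Functions handoff can be sketched with the boto3 SDK. The payload field names, ruleset identifier, and state machine ARN below are illustrative assumptions, not the actual MACROS contract:

```python
import json


def build_review_input(document_key, ruleset_id, multi_call=False):
    """Build the JSON input for a Step Functions review execution.

    Field names here are hypothetical; the real MACROS payload may differ.
    """
    return json.dumps({
        "document_key": document_key,  # S3 key of the uploaded article
        "ruleset_id": ruleset_id,      # which guideline set to review against
        "multi_call": multi_call,      # one Bedrock call per rule when True
    })


def start_review(state_machine_arn, payload):
    """Kick off the review workflow.

    Requires AWS credentials and a deployed state machine;
    not executed in this sketch.
    """
    import boto3
    client = boto3.client("stepfunctions")
    return client.start_execution(stateMachineArn=state_machine_arn, input=payload)
```

Building the payload separately from the `start_execution` call keeps the contract testable without AWS access.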

Future enhancements

While our current architecture uses AWS Step Functions for workflow orchestration, we are exploring the potential of Amazon Bedrock Flows for future iterations. Bedrock Flows offers promising capabilities for streamlining AI-driven workflows, potentially simplifying our architecture and enhancing integration with other Bedrock services. This alternative could provide more seamless management of our AI processes, especially as we scale and evolve our solution.

Content review and revision

At the core of MACROS lies its Content Review and Revision functionality with Amazon Bedrock foundation models. The Content Review and Revision block consists of five major components: 1) the optional Filtering stage, 2) Chunking, 3) Review, 4) Revision, and 5) Post-processing, depicted in Figure 3.

Figure 3. Content review and revision pipeline

Here is how MACROS processes the uploaded medical content:

  1. Filtering (Optional): The journey begins with an optional filtering step. This smart feature checks whether the guidelines are relevant for the article, potentially saving time and resources on unnecessary processing.
  2. Chunking: The source text is then split into paragraphs. This crucial step facilitates good-quality analysis and helps prevent unintended revisions to unrelated text. Chunking can be performed using heuristics, such as punctuation or regular expression-based splits, as well as using large language models (LLMs) to identify semantically complete chunks of text.
  3. Review: Each paragraph or section undergoes a thorough review against the relevant rules and guidelines.
  4. Revision: Only the paragraphs flagged as non-adherent move forward to the revision stage, streamlining the process and maintaining the integrity of adherent content. The AI suggests updates to bring non-adherent paragraphs in line with the latest guidelines and Flo's style requirements.
  5. Post-processing: Finally, the revised paragraphs are seamlessly integrated back into the original text, resulting in an updated, adherent document.
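The five steps above can be sketched end to end in Python. This is a minimal illustration of the control flow only: the review and revision stand-ins below use trivial string checks in place of the Amazon Bedrock calls MACROS actually makes, and all function names are our own:

```python
import re


def chunk(text):
    """Heuristic chunking: split on blank lines into paragraphs."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def review(paragraph, rules):
    """Stand-in for the Bedrock review call; True means adherent.

    A trivial keyword check replaces the LLM judgment here: a paragraph
    containing a flagged phrase is treated as non-adherent.
    """
    return not any(r.lower() in paragraph.lower() for r in rules)


def revise(paragraph):
    """Stand-in for the Bedrock revision call."""
    return paragraph + " [revised]"


def process(text, rules):
    """Chunk, review each paragraph, revise only non-adherent ones,
    then merge everything back together (post-processing)."""
    out = []
    for p in chunk(text):
        out.append(p if review(p, rules) else revise(p))
    return "\n\n".join(out)
```

Note that adherent paragraphs pass through untouched, which is exactly what keeps unintended revisions out of unrelated text.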

The Filtering step can be performed using an additional LLM call through Amazon Bedrock that assesses each section individually with the following prompt structure:

A diagram illustrating a filtering prompt workflow. The layout consists of three main components connected by orange arrows. On the left, a rectangular box contains specific prompt text that reads: "You are a highly skilled language model designed to assess whether guidelines are relevant for a given article. I will give you the article text as well as the rules text. Your task is to analyze whether the provided guidelines are relevant for this article. If at least one rule is relevant, you must reply with 1. If no rules are relevant, please respond with 0." In the middle section labeled "Context", there are two bullet points listing "Article to assess" and "Rules/Guidelines", with the Amazon Bedrock logo displayed in turquoise below. The final component on the right shows "1/0 Relevance response" as the output. The entire workflow is enclosed within an orange dashed border frame.

Figure 4. Simplified LLM-based filtering step
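Under the assumptions that the prompt text matches Figure 4 and that a Claude Haiku model handles filtering (the exact model choice is not stated above), the call can be sketched with the Bedrock Converse API:

```python
FILTER_PROMPT = (
    "You are a highly skilled language model designed to assess whether "
    "guidelines are relevant for a given article. I will give you the "
    "article text as well as the rules text. Your task is to analyze "
    "whether the provided guidelines are relevant for this article. "
    "If at least one rule is relevant, you must reply with 1. "
    "If no rules are relevant, please respond with 0."
)


def build_filter_prompt(article, rules):
    """Attach the context (article and rule set) to the instruction."""
    return f"{FILTER_PROMPT}\n\nArticle:\n{article}\n\nRules:\n{rules}"


def filter_call(client, article, rules,
                model_id="anthropic.claude-3-haiku-20240307-v1:0"):
    """One Bedrock Converse call returning True when the rules are relevant.

    `client` is a bedrock-runtime client; the model ID is an assumption.
    """
    response = client.converse(
        modelId=model_id,
        messages=[{"role": "user",
                   "content": [{"text": build_filter_prompt(article, rules)}]}],
    )
    return response["output"]["message"]["content"][0]["text"].strip() == "1"
```

Constraining the model to a bare 1/0 answer keeps parsing trivial and the call cheap.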

Further, non-LLM approaches can be feasible to support the Filtering step:

  1. Encoding the rules and the articles into dense embedding vectors and calculating similarity between them. By setting a similarity threshold, we can identify which rule set is considered relevant for the input document.
  2. Similarly, the direct keyword-level overlap between the document and the rule can be identified using BLEU or ROUGE metrics.
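As a rough sketch of the embedding-based variant, the snippet below uses bag-of-words counts with cosine similarity as a stand-in for dense embedding vectors; in practice a Bedrock embedding model would produce the vectors, and the threshold value is purely illustrative:

```python
import math
from collections import Counter


def bow(text):
    """Bag-of-words term counts; a stand-in for a dense embedding."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def rule_set_relevant(document, rules, threshold=0.2):
    """A rule set is relevant if any rule clears the similarity threshold."""
    doc_vec = bow(document)
    return any(cosine(doc_vec, bow(r)) >= threshold for r in rules)
```

The same structure applies unchanged once `bow` is swapped for a real embedding call; only the threshold needs re-tuning.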

Content review, as already mentioned, is performed on a text section basis against a group of rules and leads to a response in XML format, such as:


<result>
  <section>
    <text>Section text without any changes</text>
    <adherent>0</adherent>
    <rule>Text of the non-adherent rule</rule>
    <reason>Reason why the section is non-adherent to the rule</reason>
    <rule>Text of the non-adherent rule</rule>
    <reason>Reason why the section is non-adherent to the rule</reason>
  </section>
  <section>
    <text>Section text without any changes</text>
    <adherent>1</adherent>
  </section>
  <section>
    <text>Section text without any changes</text>
    <adherent>1</adherent>
  </section>
</result>

Here, 1 indicates adherence and 0 indicates non-adherence of the text to the specified rules. Using the XML format helps to achieve reliable parsing of the output.
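Parsing such a response is straightforward with a standard XML library. The tag names and sample content below are illustrative, since the exact MACROS schema is not published:

```python
import xml.etree.ElementTree as ET

# Hypothetical sample response; tag names are assumptions.
SAMPLE = """
<result>
  <section>
    <text>Vitamin D toxicity is impossible.</text>
    <adherent>0</adherent>
    <rule>State that excessive vitamin D intake can be harmful.</rule>
    <reason>The section contradicts current toxicity guidance.</reason>
  </section>
  <section>
    <text>Vitamin D supports bone health.</text>
    <adherent>1</adherent>
  </section>
</result>
"""


def non_adherent_sections(xml_text):
    """Parse the review XML and keep only sections flagged with 0."""
    root = ET.fromstring(xml_text)
    flagged = []
    for sec in root.iter("section"):
        if sec.findtext("adherent") == "0":
            flagged.append({
                "text": sec.findtext("text"),
                "rule": sec.findtext("rule"),
                "reason": sec.findtext("reason"),
            })
    return flagged
```

Only the flagged sections need to travel onward to the Revision step, which is what keeps adherent content untouched.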

This Review step iterates over the sections in the text to make sure that the LLM pays attention to each section individually, which led to more robust results in our experimentation. To facilitate higher non-adherent section detection accuracy, the user can also use the Multi-call mode, where instead of one Amazon Bedrock call assessing adherence of the article against all rules, we have one independent call per rule.
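The difference between the two modes comes down to how prompts are constructed, which can be sketched as follows (the prompt wording is a simplified stand-in, not the actual MACROS prompt):

```python
REVIEW_TEMPLATE = (
    "Review the following section against the rules and answer in XML "
    "with adherence 1 or 0.\n\nSection:\n{section}\n\nRules:\n{rules}"
)


def build_review_prompts(section, rules, multi_call):
    """Single-call mode sends one prompt covering all rules;
    Multi-call mode sends one independent prompt per rule."""
    if multi_call:
        return [REVIEW_TEMPLATE.format(section=section, rules=r) for r in rules]
    return [REVIEW_TEMPLATE.format(section=section, rules="\n".join(rules))]
```

Multi-call mode trades N times the request volume for sharper per-rule attention, so it pays off mainly on high-priority rule sets.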

The Revision step receives the output of the Review (non-adherent sections and the reasons for non-adherence), as well as the instruction to create the revision in a similar tone. It then suggests revisions of the non-adherent sentences in a style similar to the original text. Finally, the Post-processing step combines the original text with the new revisions, making sure that no other sections are modified.
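A minimal sketch of the Post-processing merge, under our own assumption that sections are tracked by index and revisions arrive as an index-to-text mapping:

```python
def merge_revisions(sections, revisions):
    """Combine the original sections with revised replacements,
    leaving every section without a revision untouched.

    sections: list of original section strings, in document order
    revisions: dict mapping section index -> revised text
    """
    return "\n\n".join(revisions.get(i, s) for i, s in enumerate(sections))
```

Keying revisions by position (rather than by text matching) is one simple way to guarantee that untouched sections come back byte-identical.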

Different steps of the flow require different levels of LLM model complexity. While simpler tasks like chunking can be done efficiently with a relatively small model like the Claude Haiku model family, more complex reasoning tasks like content review and revision require larger models like the Claude Sonnet or Opus model families to facilitate accurate analysis and high-quality content generation. This tiered approach to model selection optimizes both performance and cost-efficiency of the solution.
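Such a tiered setup can be expressed as a simple lookup table. The model IDs below are Bedrock identifiers current at the time of writing and may change, and the task-to-tier assignment is our reading of the text rather than the exact MACROS configuration:

```python
HAIKU = "anthropic.claude-3-haiku-20240307-v1:0"
SONNET = "anthropic.claude-3-5-sonnet-20240620-v1:0"

# Cheap, fast model for mechanical tasks; larger model for reasoning tasks.
MODEL_BY_TASK = {
    "chunking": HAIKU,
    "filtering": HAIKU,
    "review": SONNET,
    "revision": SONNET,
}


def model_for(task):
    """Fall back to the most capable tier for unknown tasks."""
    return MODEL_BY_TASK.get(task, SONNET)
```

Centralizing the mapping makes the "smoothly upgrade to newer models" point concrete: swapping a tier means editing one constant.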

Operating modes

The Content Review and Revision feature operates in two UI modes: Detailed Document Processing and Multi Document Processing, each catering to different scales of content management. The Detailed Document Processing mode offers a granular approach to content analysis and is depicted in Figure 5. Users can upload documents in various formats (PDF, TXT, JSON, or paste text directly) and specify the guidelines against which the content should be evaluated.

Figure 5. Detailed Document Processing example

Users can choose from predefined rule sets (here, Vitamin D, Breast Health, and Premenstrual Syndrome and Premenstrual Dysphoric Disorder (PMS and PMDD)) or enter custom guidelines. These custom guidelines can include rules such as "The title of the article must be medically accurate" as well as examples of content that is adherent and non-adherent to the rule.

The rule sets make sure that the analysis aligns with specific medical standards and Flo's unique style guide. The interface allows for on-the-fly adjustments, making it ideal for thorough, individual document evaluations. For larger-scale operations, the Multi Document Processing mode should be used. This mode is designed to handle numerous custom JSON files simultaneously, mimicking how Flo would integrate MACROS into their content management system.

Extracting rules and guidelines from unstructured data

Actionable and well-prepared guidelines are not always directly available. Sometimes they are given in unstructured files or have to be found. Using the Rule Optimizer feature, we can extract and refine actionable guidelines from multiple complex documents.

Rule Optimizer processes raw PDF documents to extract text, which is then chunked into meaningful sections based on document headers. This segmented content is processed through Amazon Bedrock using specialized system prompts, with two distinct modes: Style/tonality mode and Medical mode.
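Header-based chunking can be approximated with a simple heuristic, shown below under our own assumption that headers are Title Case or ALL CAPS lines without a trailing period (the actual MACROS rules may differ):

```python
def chunk_by_headers(text):
    """Split extracted PDF text into (header, body) sections.

    Heuristic: a header is a non-empty line with no final period
    that is either ALL CAPS or in Title Case.
    """
    sections, header, body = [], "Preamble", []
    for line in text.splitlines():
        stripped = line.strip()
        is_header = (stripped and not stripped.endswith(".")
                     and (stripped.isupper() or stripped.istitle()))
        if is_header:
            if body:
                sections.append((header, "\n".join(body).strip()))
                body = []
            header = stripped
        else:
            body.append(line)
    if body:
        sections.append((header, "\n".join(body).strip()))
    return sections
```

Each (header, body) pair then becomes one unit of context for the Bedrock rule-extraction prompt.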

Style/tonality mode focuses on extracting the guidelines on how the text should be written, its style, and what formats and terms can or cannot be used.

Rule Optimizer assigns a priority to each rule: high, medium, or low. The priority level indicates the rule's importance, guiding the order of content review and focusing attention on critical areas first. Rule Optimizer includes a manual editing interface where users can refine rule text, modify classifications, and manage priorities. Subsequently, if users need to update a given rule, the changes are saved for future use in Amazon S3.
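A rule with its class and priority can be modeled as a small data structure, with review order driven by priority. The field names below are illustrative, not the stored MACROS schema:

```python
from dataclasses import dataclass

# Lower number = reviewed earlier.
PRIORITY_ORDER = {"high": 0, "medium": 1, "low": 2}


@dataclass
class Rule:
    text: str
    rule_class: str   # e.g. "Medical condition guidelines"
    priority: str     # "high" | "medium" | "low"


def order_for_review(rules):
    """Review against high-priority rules first, as Rule Optimizer intends."""
    return sorted(rules, key=lambda r: PRIORITY_ORDER[r.priority])
```

Sorting is stable, so rules with equal priority keep their extraction order.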

The Medical mode is designed to process medical documents and is adapted to more scientific language. It allows grouping of extracted rules into three classes:

  1. Medical condition guidelines
  2. Treatment-specific guidelines
  3. Changes to advice and developments in health
A diagram illustrating a medical rule optimization workflow titled "Rule Optimization Prompt". The layout consists of three main components connected by orange arrows. On the left, within a rectangular box, is the prompt text that reads: "You are an expert AI assistant specializing in extracting and optimizing rules from medical guideline documents." followed by placeholder sections for "Good rule description" and "Format description". Below this is a section labeled "Context" containing a bullet point for "Articles for rule extraction and optimization". In the center, the Amazon Bedrock logo is displayed in turquoise. The right component shows the expected output format with two numbered lines: "1: Rule Class, priority, text" and "2: Rule Class, priority, text". The entire workflow is enclosed within an orange dashed border frame. The diagram demonstrates a process for converting medical guidelines into structured rule formats.

Figure 6. Simplified medical rule optimization prompt

Figure 6 provides an example of a medical rule optimization prompt, consisting of three main components: role setting (medical AI expert), a description of what makes a good rule, and finally the expected output. We consider a rule to be of sufficiently good quality if it is:

  • Clear, unambiguous, and actionable
  • Relevant, consistent, and concise (max two sentences)
  • Written in active voice
  • Free of unnecessary jargon
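Of these criteria, conciseness is easy to check mechanically; clarity, active voice, and jargon still need human or LLM judgment. A minimal heuristic checker, under our own sentence-splitting assumption:

```python
import re


def quality_issues(rule):
    """Flag mechanically checkable quality problems in a rule string.

    Only the conciseness criterion (max two sentences) is automated here;
    the sentence split on ./!/? is a rough heuristic.
    """
    issues = []
    sentences = [s for s in re.split(r"[.!?]+", rule) if s.strip()]
    if len(sentences) > 2:
        issues.append("more than two sentences")
    if not rule.strip():
        issues.append("empty rule")
    return issues
```

Such lightweight checks can run before the manual editing interface, so experts spend their time on the judgment calls.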

Implementation considerations and challenges

During our PoC development, we identified several crucial considerations that could benefit others implementing similar solutions:

  • Data preparation: This emerged as a fundamental challenge. We learned the importance of standardizing input formats for both medical content and guidelines while maintaining consistent document structures. Creating diverse test sets across different medical topics proved essential for comprehensive validation.
  • Cost management: Monitoring and optimizing cost quickly became a key priority. We implemented token usage tracking and optimized prompt design and batch processing to balance performance and efficiency.
  • Regulatory and ethical compliance: Given the sensitive nature of medical content, strict regulatory and ethical safeguards were critical. We established robust documentation practices for AI decisions, implemented strict version control for medical guidelines, and maintained continuous human medical expert oversight of the AI-generated suggestions. Regional healthcare regulations were carefully considered throughout implementation.
  • Integration and scaling: We recommend starting with a standalone testing environment while planning for future content management system (CMS) integration through well-designed API endpoints. Building with modularity in mind proved valuable for future enhancements. Throughout the process, we faced common challenges such as maintaining context in long medical articles, balancing processing speed with accuracy, and facilitating a consistent tone across AI-suggested revisions.
  • Model optimization: The diverse model selection capability of Amazon Bedrock proved particularly valuable. Through its platform, we can choose optimal models for specific tasks, achieve cost efficiency without sacrificing accuracy, and smoothly upgrade to newer models, all while maintaining our existing architecture.

Initial results

Our proof of concept delivered strong results across the critical success metrics, demonstrating the potential of AI-assisted medical content review. The solution exceeded target processing speed improvements while maintaining 80% accuracy and over 90% recall in identifying content requiring updates. Most notably, the AI-powered system applied medical guidelines more consistently than manual reviews and significantly reduced the time burden on medical experts.

Key takeaways

During implementation, we uncovered several insights critical for optimizing AI performance in medical content evaluation. Content chunking was essential for accurate evaluation across long documents, and expert validation of parsing rules helped medical experts maintain scientific precision. Most importantly, the project confirmed that human-AI collaboration, not full automation, is key to successful implementation. Regular expert feedback and clear performance metrics guided system refinements and incremental improvements. While the system significantly streamlines the review process, it works best as an augmentation tool, with medical experts remaining essential for final validation, creating a more efficient hybrid approach to medical content management.

Conclusion and next steps

This first part of our series demonstrates how generative AI can make the medical content review process faster, more efficient, and scalable while maintaining high accuracy. Stay tuned for Part 2 of this series, where we cover the production journey, deep diving into challenges and scaling strategies. Are you ready to move your AI initiatives into production?


About the authors

Liza (Elizaveta) Zinovyeva, Ph.D., is an Applied Scientist at the AWS Generative AI Innovation Center and is based in Berlin. She helps customers across different industries to integrate generative AI into their existing applications and workflows. She is passionate about AI/ML, finance, and software security topics. In her spare time, she enjoys spending time with her family, sports, learning new technologies, and table quizzes.

Callum Macpherson is a Data Scientist at the AWS Generative AI Innovation Center, where cutting-edge AI meets real-world business transformation. Callum partners directly with AWS customers to design, build, and scale generative AI solutions that unlock new opportunities, accelerate innovation, and deliver measurable impact across industries.

Arefeh Ghahvechi is a Senior AI Strategist at the AWS GenAI Innovation Center, specializing in helping customers realize rapid value from generative AI technologies by bridging innovation and implementation. She identifies high-impact AI opportunities while building the organizational capabilities needed for scaled adoption across enterprises and national initiatives.

Nuno Castro is a Sr. Applied Science Manager. He has 19 years of experience in the field in industries such as finance, manufacturing, and travel, leading ML teams for 11 years.

Dmitrii Ryzhov is a Senior Account Manager at Amazon Web Services (AWS), helping digital-native companies unlock business potential through AI, generative AI, and cloud technologies. He works closely with customers to identify high-impact business initiatives and accelerate execution by orchestrating strategic AWS support, including access to the right expertise, resources, and innovation programs.

Nikita Kozodoi, PhD, is a Senior Applied Scientist at the AWS Generative AI Innovation Center working on the frontier of AI research and business. Nikita builds and deploys generative AI and ML solutions that solve real-world problems and drive business impact for AWS customers across industries.

Aiham Taleb, PhD, is a Senior Applied Scientist at the Generative AI Innovation Center, working directly with AWS enterprise customers to leverage generative AI across multiple high-impact use cases. Aiham has a PhD in unsupervised representation learning, and has industry experience that spans various machine learning applications, including computer vision, natural language processing, and medical imaging.
