
Rho Knows Clinical Research Services

What We Learned at PhUSE US Connect

Posted by Brook White on Tue, Jun 12, 2018 @ 09:40 AM

Ryan Bailey, MA is a Senior Clinical Researcher at Rho. He has over 10 years of experience conducting multicenter asthma research studies, including the Inner City Asthma Consortium (ICAC) and the Community Healthcare for Asthma Management and Prevention of Symptoms (CHAMPS) project. Ryan also coordinates Rho’s Center for Applied Data Visualization, which develops novel data visualizations and statistical graphics for use in clinical trials.

Last week, PhUSE hosted its first ever US Connect conference in Raleigh, NC. Founded in Europe in 2004, the independent, non-profit Pharmaceutical Users Software Exchange has been a rapidly growing presence and influence in the field of clinical data science. While PhUSE routinely holds smaller events in the US, including their popular Computational Science Symposia and Single Day Events, this was the first time they had held a large multi-day conference with multiple work streams outside of Europe. The three-day event attracted over 580 data scientists, biostatisticians, statistical programmers, and IT professionals from across the US and around the world to focus on the theme of "Transformative Current and Emerging Best Practices."

After three days immersed in data science, we wanted to provide a round-up of some of the main themes of the conference and trends for our industry.

Emerging Technologies are already Redefining our Industry

It can be hard to distinguish hype from reality when it comes to emerging technologies like big data, artificial intelligence, machine learning, and blockchain. Those buzzwords made their way into many presentations throughout the conference, but there was more substance than I expected. It is clear that many players in our industry (FDA included) are actively exploring ways to scale up their capabilities to wrangle massive data sets, rely on machines to automate long-standing data processing, formatting, and cleaning processes, and use distributed database technologies like blockchain to keep data secure, private, and personalized. These technologies are not just reshaping other sectors like finance, retail, and transportation; they are well on their way to disrupting and radically changing aspects of clinical research.

The FDA is Leading the Way

Our industry has gotten a reputation for being slow to evolve, and we sometimes use the FDA as our scapegoat. Regulations take a long time to develop, formalize, and finalize, and we tend to be reluctant to move faster than regulations. However, for those who think the FDA is lagging behind in technological innovation and data science, US Connect was an eye-opener. With 30 delegates at the conference and 16 presentations, the agency had a strong and highly visible presence.

Moreover, the presentations by the FDA were often the most innovative and forward-thinking. Agency presenters provided insight into how the offices of Computational Science and Biomedical Informatics are applying data science to aid in reviewing submissions for data integrity and quality, detecting data and analysis errors, and setting thresholds for technical rejection of study data. In one presentation, the FDA demonstrated its Real-time Application for Portable Interactive Devices (RAPID) to show how the agency is able to track key safety and outcomes data in real time amid the often chaotic and frantic environment of a viral outbreak. RAPID is an impressive feat of technical engineering, managing to acquire massive amounts of unstructured symptom data from multiple device types in real time, process them in the cloud, and perform powerful analytics for "rapid" decision making. It is the type of ambitious, technically advanced project you expect to see coming out of Silicon Valley, not Silver Spring, MD.

It was clear that the FDA is striving to be at the forefront of bioinformatics and data science, and in turn, they are raising expectations for everyone else in the industry.

The Future of Development is "Multi-lingual"  

A common theme through all the tracks is the need to evolve beyond narrowly focused specialization in our jobs. Whereas 10-15 years ago, developing deep expertise in one functional area or one tool was a good way to distinguish yourself as a leader and bring key value to your organization, a similar approach may hinder your career in the evolving clinical research space. Instead, many presenters advocated that the data scientist of the future specialize in a few different tools and have broad domain knowledge. As keynote speaker Ian Khan put it, we need to find a way to be both specialists and generalists at the same time. Nowhere was this more prevalent than in discussions around which programming languages will dominate our industry in the years to come.

While SAS remains the go-to tool for stats programming and biostatistics, the general consensus is that knowing SAS alone will not be adequate in years to come. The prevailing languages getting the most attention for data science are R and Python. While we heard plenty of debate about which one will emerge as the more prominent, it was agreed that the ideal scenario would be to know at least one, R or Python, in addition to SAS.

We Need to Break Down Silos and Improve our Teams

On a similar note, many presenters advocated for rethinking our traditional siloed approach to functional teams. As one vice president of a major Pharma company put it, "we have too much separation in our work - the knowledge is here, but there's no crosstalk." Rather than passing deliverables between distinct departments with minimal communication, clinical data science requires taking a collaborative multi-functional approach. The problems we face can no longer be parsed out and solved in isolation. As a multi-discipline field, data science necessarily requires getting diverse stakeholders in the room and working on problems together.

As for how to achieve this collaboration, Dr. Michael Rappa delivered an excellent plenary session on how to operate highly productive data science teams based on his experience directing the Institute for Advanced Analytics at North Carolina State University. His advice bucks the traditional notion that you solve a problem by selecting the most experienced subject matter experts and putting them in a room together. Instead, he demonstrated how artfully crafted teams that value leadership skills and motivation over expertise alone can achieve incredibly sophisticated and innovative output.

Change Management is an Essential Need

Finally, multiple sessions addressed the growing need for change management skills. As the aforementioned emerging technologies force us to acquire new knowledge and skills and adapt to a changing landscape, employees will need help to deftly navigate change. When asked what skills are most important for managers to develop, a VP from a large drug manufacturer put it succinctly, "our leaders need to get really good at change management."

In summary, PhUSE US Connect is helping our industry look to the future, especially when it comes to clinical data science, but the future may be closer than we think. Data science is not merely an analytical discipline to be incorporated into our existing work; it is going to fundamentally alter how we operate and what we achieve in our trials. The question for industry is whether we're paying attention and pushing ourselves to evolve in step to meet those new demands.

Webinar: Understanding the FDA Guidance on Data Standards

Heat Maps for Database Lock

Posted by Brook White on Tue, Aug 08, 2017 @ 11:50 AM

Kristen Mason, MS, is a Senior Biostatistician at Rho. She has over 4 years of experience providing statistical support for studies conducted under the Immune Tolerance Network (ITN) and Clinical Trials in Organ Transplantation (CTOT). She has a particular interest in data visualization, especially creating visualizations within SAS using the graph template language (GTL).

Heather Kopetskie, MS, is a Senior Biostatistician at Rho. She has over 10 years of experience in statistical planning, analysis, and reporting for Phase 1, 2 and 3 clinical trials and observational studies. Her research experience includes over 8 years focusing on solid organ and cell transplantation through work on the Immune Tolerance Network (ITN) and Clinical Trials in Organ Transplantation (CTOT) project. In addition, Heather serves as Rho’s biostatistics operational service leader, an internal expert sharing biostatistical industry trends, best practices, processes and training.

Preparing a database for lock can be a burdensome process. It requires coordinated effort from an entire clinical study team, including, but not limited to, the clinical data manager, study monitor, biostatistician, clinical project manager, principal investigator, and medical monitor. The team must work together to ensure the accuracy and reliability of the data, but with so many sites, subjects, visits, case report forms (CRFs), and data points it can be difficult to stay on top of the entire process. 

Using existing metadata (see Mining Metadata for Clinical Research Activities for more information on metadata), graphics can be created to visually represent the overall status of each requirement for database lock. This is possible using a graphic called a ‘heat map’ that displays the CRF metadata. The resulting graphic is shown below.

heat map showing CRF metadata for database lock

The graphic has one row per subject and one column for each CRF collected at each visit. This results in one ‘box’ per subject per visit per CRF. Each box is colored and/or annotated to indicate the current status of each CRF. 

Broadly speaking, a quick glance at this graphic can show the clinical study team exactly how many CRFs have yet to be completed, where queries have not yet been closed, which CRFs have been source data verified, and whether or not an individual CRF has been locked.  Not to mention, all of this information can be identified for a specific subject at a specific visit for a specific CRF. 

Focusing on the details of our particular example, it is easy to see that no subject has yet initiated data entry for either Visit 4 or Visit 5. Additionally, three subjects have not started data entry for the Treatment Visit, ten for Visit 1, fifteen for Visit 2, and twenty-four for Visit 3. An open query remains for several subjects on the TRT form at the Treatment Visit, and for just subject 88528 on the PE form at the Screening Visit. A handful of forms have been source verified and no CRFs have been locked. Additionally, the graphic provides detail on the total number of subjects, visits, and CRFs for the study. This helps reveal specifics such as which visits are more burdensome with multiple CRFs and exactly how far along the subjects are in the study.

Historically, this information has been conveyed through pages and pages of multiple listings, which can take minutes if not hours to decipher. Having all of the information in a single snapshot can help determine what steps need to be taken to get to database lock quickly and accurately. 
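The layout described above (one row per subject, one column per visit and CRF, one status per box) can be sketched in a few lines of Python. This is only an illustration of the data structure behind such a heat map, not the SAS implementation; the status names, symbols, and sample records are assumptions made up for the example.

```python
# Hypothetical status codes for a CRF; a real system would define its own.
STATUS_SYMBOLS = {
    "not_started": ".",
    "entered": "E",
    "open_query": "Q",
    "verified": "V",
    "locked": "L",
}


def build_grid(records, columns):
    """Pivot (subject, visit, form, status) records into one row per subject."""
    grid = {}
    for subject, visit, form, status in records:
        grid.setdefault(subject, {})[(visit, form)] = status
    # Any CRF without a metadata record is treated as not started.
    return {
        subject: [row.get(col, "not_started") for col in columns]
        for subject, row in sorted(grid.items())
    }


def render(grid):
    """Return a text version of the heat map, one line per subject."""
    return [
        f"{subject} " + "".join(STATUS_SYMBOLS[s] for s in statuses)
        for subject, statuses in grid.items()
    ]


# Illustrative data only: three visit/CRF columns and two subjects.
columns = [("Screening", "PE"), ("Treatment", "TRT"), ("Visit 1", "VS")]
records = [
    ("1001", "Screening", "PE", "verified"),
    ("1001", "Treatment", "TRT", "open_query"),
    ("1002", "Screening", "PE", "entered"),
]
grid = build_grid(records, columns)
lines = render(grid)
```

In a plotting tool, each symbol would become a colored box. Note that a subject with no records at all would not appear in this sketch, so a real version would likely start from the enrollment list rather than from the CRF records.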

Further instruction on how to implement this graphic within SAS will be available soon. 

Post-Lock Data Flow: From CRF to FDA

An Interactive Suite of Data Visualizations for Safety Monitoring

Posted by Brook White on Thu, Feb 23, 2017 @ 01:42 PM

This is the fourth in a series of posts introducing open source tools Rho is developing and sharing online. Click here to learn more about Rho's open source effort, here to read about our interactive data visualization library, Webcharts, and here to learn about SAS graphing tools we've developed.

Frequent and careful monitoring of patient safety is one of the most important concerns of any clinical trial. For the medical monitors and safety monitoring committees responsible for supervising patient well-being and ensuring product safety, this obligation requires continuous access to a variety of critical study data.

For trials with large participant enrollment, severe diseases, or complex treatments, study monitors may be tasked with reviewing thousands of data points and safety markers. Unfortunately, traditional reporting methods require monitors to comb through scores of static listings and summary tables. This method is inefficient and poses the risk that clinically-relevant signals will be obscured by the sheer volume of data common in clinical trials.

To improve safety monitoring, we created a suite of interactive data monitoring tools we call the Safety Explorer. Although the Safety Explorer can be configured to include a variety of charts specific to each study, the standard set-up includes 6 charts (click the links to learn more):

  • Adverse Events Explorer - dynamically query adverse event (AE) data in real time to go from study population view to individual patient records
  • Adverse Events Timeline - view interactive timelines for each participant showing when AEs occurred in a trial
  • Test Results Histogram - explore interactive histograms showing distribution of labs, vital signs, and other safety measures with linked data tables
  • Test Results Outlier Explorer - track patient trajectories over time for lab measures, vital signs, and other safety endpoints in line charts
  • Test Results Over Time - explore population averages for labs, vital signs, and other safety endpoints in box or violin plots
  • Shift Plot - monitor changes in lab measures, vital signs, and other safety endpoints between study events in a dot plot

The Safety Explorer uses common CDISC data standards to quickly create consistent charts for any project. Within a given chart, users can use filters to dynamically sort, highlight, and drill down to data points of interest using controls familiar to anyone who has used a website.
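As a rough illustration of the data behind the Shift Plot in the list above, the sketch below pairs each participant's result at one study event with the result at a later one. It is not the Safety Explorer's code. The field names (USUBJID, LBTEST, VISIT, LBSTRESN) are borrowed from the CDISC SDTM lab domain as an assumption about the input, and the records are invented.

```python
def shift_pairs(records, test, baseline_visit, comparison_visit):
    """Pair each participant's baseline and comparison results for one test."""
    by_subject = {}
    for rec in records:
        if rec["LBTEST"] == test and rec["VISIT"] in (baseline_visit, comparison_visit):
            by_subject.setdefault(rec["USUBJID"], {})[rec["VISIT"]] = rec["LBSTRESN"]
    # Keep only participants measured at both study events.
    return {
        subject: (visits[baseline_visit], visits[comparison_visit])
        for subject, visits in by_subject.items()
        if baseline_visit in visits and comparison_visit in visits
    }


# Illustrative records only.
records = [
    {"USUBJID": "01", "LBTEST": "ALT", "VISIT": "Baseline", "LBSTRESN": 22.0},
    {"USUBJID": "01", "LBTEST": "ALT", "VISIT": "Week 4", "LBSTRESN": 35.0},
    {"USUBJID": "02", "LBTEST": "ALT", "VISIT": "Baseline", "LBSTRESN": 30.0},
    {"USUBJID": "01", "LBTEST": "AST", "VISIT": "Week 4", "LBSTRESN": 19.0},
]
pairs = shift_pairs(records, "ALT", "Baseline", "Week 4")
```

Each pair becomes one dot, with the baseline value on one axis and the later value on the other. Points off the diagonal are the participants whose results shifted.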

Interactive Histogram with Linked Table

interactive histogram safety data

Explore the distribution of test results (click here for interactive version)

Graphical representations of data grant reviewers a systematic snapshot of the data that helps tell the story of the information. By adding interactive elements, reviewers can quickly examine the charts for patterns of interest and drill down to subject-level data instantly. This ability to quickly distinguish signal from noise gives monitors greater insight into their data and allows them to work much more efficiently.

It is common practice for us to create safety explorers for all full service projects and studies where Rho provides medical monitoring. All of the charts described here are open source and free to use, so please let us know if you have any feedback, or would like to contribute!

Interactive Box Plot Showing Results Over Time

interactive box plot showing results over time

Track changes in population test results through a study (click here for interactive version)

View "Visualizing Multivariate Data" Video

Ryan Bailey, MA is a Senior Clinical Researcher at Rho. He has over 10 years of experience conducting multicenter asthma research studies, including the Inner City Asthma Consortium (ICAC) and the Community Healthcare for Asthma Management and Prevention of Symptoms (CHAMPS) project. Ryan also coordinates Rho’s Center for Applied Data Visualization, which develops novel data visualizations and statistical graphics for use in clinical trials.

Using SAS to Create Novel Data Visualizations

Posted by Brook White on Tue, Feb 07, 2017 @ 12:59 PM

Ryan Bailey, MA is a Senior Clinical Researcher at Rho. He has over 10 years of experience conducting multicenter asthma research studies, including the Inner City Asthma Consortium (ICAC) and the Community Healthcare for Asthma Management and Prevention of Symptoms (CHAMPS) project. Ryan also coordinates Rho’s Center for Applied Data Visualization, which develops novel data visualizations and statistical graphics for use in clinical trials.

Shane Rosanbalm, MS, Senior Biostatistician, has over fifteen years of experience providing statistical support for clinical trials in all phases of drug development, from Phase I studies through NDA submissions. He has collaborated with researchers in several areas including neonatal sepsis, RA, oncology, chronic pain, hypertension, and Parkinson’s disease. He is the lead SAS developer on Rho’s Center for Applied Data Visualization, where he develops tools and publishes on best practices for visualizing and reporting data.

This is the third in a series of posts introducing open source tools Rho is developing and sharing online. Click here to learn more about Rho's open source effort.

In our last post, we introduced Webcharts, one of our many interactive web-based charting tools that uses D3. In addition to the many web-based tools that Rho has on GitHub, we also maintain a number of SAS®-based graphics repositories. In fact, our strong reputation for clinical biostatistics and expertise with SAS (and SAS graphing tools) long predated our development of web graphics.

A sampling of some of our SAS tools is provided below, but we invite you to visit GitHub and check out our full offering of SAS tools. You can use the Find a repository... Search bar to search for "SAS". All of our SAS repositories begin with "sas-".


sas codebook

SAS codebook

The SAS codebook macro is designed to provide a quick and concise summary of every variable in a SAS dataset. In addition to information about variable names, labels, types, formats, and statistics, the macro also produces a small graphic showing the distribution of values for each variable. This report is a convenient way to provide a snapshot of your data and quickly get to know a new dataset.
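To make the idea of a codebook concrete, here is a minimal Python sketch of a per-variable summary. It illustrates the concept only and is not a port of the SAS macro: the function name, the record layout, and the choice of statistics are assumptions, and the small distribution graphic is left out.

```python
from collections import Counter
from statistics import mean


def codebook(rows):
    """Summarize every variable in a list of dict records."""
    # Collect variable names in first-seen order.
    variables = []
    for row in rows:
        for name in row:
            if name not in variables:
                variables.append(name)

    summary = {}
    for name in variables:
        values = [row.get(name) for row in rows]
        present = [v for v in values if v is not None]
        entry = {"n": len(present), "missing": len(values) - len(present)}
        if present and all(isinstance(v, (int, float)) for v in present):
            # Numeric variables get simple descriptive statistics.
            entry.update(type="numeric", min=min(present),
                         max=max(present), mean=mean(present))
        else:
            # Everything else gets a frequency count per level.
            entry.update(type="character", levels=dict(Counter(present)))
        summary[name] = entry
    return summary


# Illustrative dataset only.
rows = [
    {"AGE": 34, "SEX": "F"},
    {"AGE": 51, "SEX": "M"},
    {"AGE": None, "SEX": "F"},
]
book = codebook(rows)
```

Reading `book["AGE"]` or `book["SEX"]` gives the kind of at-a-glance snapshot the paragraph above describes, including how many values are missing.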

Violin Plot

violin plot

The SAS violin plot macro is designed to allow for a quick assessment of how the distribution of a variable changes from one group to another. Think of it as a souped-up version of a box and whisker plot. In addition to seeing the median, quartiles, and min/max, you also get to see all of the individual data points as well as the density curves associated with the distributions.

Sankey Bar Chart

sankey bar chart

The SAS Sankey bar chart macro is an enhancement of a traditional stacked bar chart. In addition to showing how many subjects are in each category over time, this graphic also shows you how subjects transition from one category to another over time.
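The extra information a Sankey bar chart needs, beyond the bar heights, is a count of how many subjects move between each pair of categories from one time point to the next. A minimal Python sketch of that counting step follows. The category names and subject paths are invented, and this is not the macro's own logic.

```python
from collections import Counter


def transition_counts(categories_by_subject):
    """Count moves between categories at consecutive time points."""
    counts = Counter()
    for path in categories_by_subject.values():
        # zip pairs each time point with the one that follows it.
        for t, (before, after) in enumerate(zip(path, path[1:])):
            counts[(t, before, after)] += 1
    return counts


# Illustrative data: one category per subject at three time points.
paths = {
    "S1": ["Mild", "Mild", "None"],
    "S2": ["Severe", "Mild", "Mild"],
    "S3": ["Mild", "None", "None"],
    "S4": ["Mild", "Mild", "Mild"],
}
links = transition_counts(paths)
```

Each key `(t, before, after)` corresponds to one band drawn between the bar at time `t` and the bar at time `t + 1`, with the count setting the band's width.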

Other SAS graphics tools include a Beeswarm Plot (a strip plot with non-random jittering) and the Axis Macro for automating the selection of axis ranges for continuous variables. We are adding new SAS repositories frequently. We invite you to try the tools, share your feedback, and contribute to the development of the tools.

Visit Rho's Center for Applied Data Visualization

Webcharts: A Reusable Tool for Building Online Data Visualizations

Posted by Brook White on Wed, Jan 18, 2017 @ 01:39 PM


This is the second in a series of posts introducing open source tools Rho is developing and sharing online. Click here to learn more about Rho's open source effort.

When Rho created a team dedicated to developing novel data visualization tools for clinical research, one of the group's challenges was to figure out how to scale our graphics to every trial, study, and project we work on. In particular, we were interested in providing interactive web-based graphics, which can run in a browser and allow for intuitive, real-time data exploration.

Our solution was to create Webcharts - a web-based charting library built on top of the popular Data-Driven Documents (D3) JavaScript library - to provide a simple way to create reusable, flexible, interactive charts.

Interactive Study Dashboard

interactive study dashboard--webcharts

Track key project metrics in a single view; built with Webcharts (click here for interactive version)

Webcharts allows users to compose a wide range of chart types, ranging from basic charts (e.g., scatter plots, bar charts, line charts), to intermediate designs (e.g., histograms, linked tables, custom filters), to advanced displays (e.g., project dashboards, lab results trackers, outcomes explorers, and safety timelines). Webcharts' extensible and customizable charting library allows us to quickly produce standard charts while also crafting tailored data visualizations unique to each dataset, phase of study, and project.

This flexibility has allowed us to create hundreds of custom interactive charts, including several that have been featured alongside Rho's published work. The Immunologic Outcome Explorer (shown below) was adapted from Figure 3 in the New England Journal of Medicine article, Randomized Trial of Peanut Consumption in Infants at Risk for Peanut Allergy. The chart was originally created in response to reader correspondence, and was later updated to include follow-up data in conjunction with a second article, Effect of Avoidance on Peanut Allergy after Early Peanut Consumption. The interactive version allows the user to select from 10 outcomes on the y-axis. Selections for sex, ethnicity, study population, skin prick test stratum, and peanut specific IgE at 60 and 72 months of age can be interactively chosen to filter the data and display subgroups of interest. Figure options (e.g., summary lines, box and violin plots) can be selected under the Overlays heading to alter the properties of the figure.

Immunologic Outcome Explorer

immunologic outcome explorer using webcharts

Examine participant outcomes for the LEAP study (click here for interactive version)

Because Webcharts is designed for the web, the charts require no specialized software. If you have a web browser (e.g., Firefox, Chrome, Safari, Internet Explorer) and an Internet connection, you can see the charts. Likewise, navigating the charts is intuitive because we use controls familiar to anyone who has used a web browser (radio buttons, drop-down menus, sorting, filtering, mouse interactions). A manuscript describing the technical design of Webcharts was recently published in the Journal of Open Research Software.

The decision to build for general web use was intentional. We were not concerned with creating a proprietary charting system - of which there are many - but an extensible, open, generalizable tool that could be adapted to a variety of needs. For us, that means charts to aid in the conduct of clinical trials, but the tool is not limited to any particular field or industry. We also released Webcharts open source so that other users could contribute to the tools and help us refine them.

Because they are web-based, charts for individual studies and programs are easily implemented in RhoPORTAL, our secure collaboration and information delivery portal, which allows us to share the charts with study team members and sponsors while carefully limiting access to sensitive data.

Webcharts is freely available online on Rho's GitHub site. The site contains a wiki that describes the tool, an API, and interactive examples. We invite anyone to download and use Webcharts, give us feedback, and participate in its development.


Jeremy Wildfire, MS, Senior Biostatistician, has over ten years of experience providing statistical support for multicenter clinical trials and mechanistic studies related to asthma, allergy, and immunology.  He is the head of Rho’s Center for Applied Data Visualization, which develops innovative data visualization tools that support all phases of the biomedical research process. Mr. Wildfire also founded Rho’s Open Source Committee, which guides the open source release of dozens of Rho’s graphics tools for monitoring, exploring, and reporting data. 

Ryan Bailey, MA is a Senior Clinical Researcher at Rho. He has over 10 years of experience conducting multicenter asthma research studies, including the Inner City Asthma Consortium (ICAC) and the Community Healthcare for Asthma Management and Prevention of Symptoms (CHAMPS) project. Ryan also coordinates Rho’s Center for Applied Data Visualization, which develops novel data visualizations and statistical graphics for use in clinical trials.

Tips for Effective Enrollment Tracking

Posted by Brook White on Thu, Jan 05, 2017 @ 10:05 AM

Heather Kopetskie, MS, is a Senior Biostatistician at Rho. She has over 10 years of experience in statistical planning, analysis, and reporting for Phase 1, 2 and 3 clinical trials and observational studies. Her research experience includes over 8 years focusing on solid organ and cell transplantation through work on the Immune Tolerance Network (ITN) and Clinical Trials in Organ Transplantation (CTOT) project. In addition, Heather serves as Rho’s biostatistics operational service leader, an internal expert sharing biostatistical industry trends, best practices, processes and training.

It’s important to track enrollment of a trial over time to make sure accrual goals are met, but knowing the enrollment number alone isn’t sufficient to know how a study is progressing. Viewing enrollment visually can provide a quick overview not only of how enrollment is progressing in the trial as a whole but also of how specific sites are performing, whether enrollment goals will be met, and key information about the enrollment population. Below are some graphics that have been valuable for us in keeping track of site performance.

Most trials start with a rolling site activation, as many factors affect when a site may be activated. In addition, each site may have a different target enrollment. The Enrollment Over Time graph below takes into account when sites are activated and their target randomization rates to show the target rates over time needed to meet accrual goals, along with actual enrollment rates. In addition to overall study status, the graph can be subset to review a particular site’s status. The Overall Enrollment bar graph gives a quick overview of how many subjects have been screened, enrolled, and randomized at each site, along with the target accrual, to show which sites are performing and which need additional follow-up.

enrollment metrics
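One way to build a target line that respects rolling site activation is to add up each site's expected accrual from its own activation date, capped at that site's target. The Python sketch below shows that calculation under simple assumptions: a constant monthly rate per site, and hypothetical field names and numbers. It is not necessarily how the graph above was produced.

```python
def target_enrollment(sites, month):
    """Cumulative target across sites at a given study month."""
    total = 0.0
    for site in sites:
        # A site contributes nothing before it is activated.
        months_active = max(0, month - site["activation_month"])
        # Expected accrual grows at the site's rate until its target is reached.
        total += min(site["target"], months_active * site["rate_per_month"])
    return total


# Illustrative sites only: Site B is activated three months after Site A.
sites = [
    {"name": "Site A", "activation_month": 0, "rate_per_month": 2.0, "target": 10},
    {"name": "Site B", "activation_month": 3, "rate_per_month": 1.5, "target": 6},
]
curve = [target_enrollment(sites, m) for m in range(0, 9)]
```

Plotting `curve` against actual cumulative enrollment gives the overall view. Passing a single-site list gives the same comparison for one site.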

In some studies, it’s important to track sub-groups of enrollment. This can be done by including sub-bars that show what percent of subjects are in each group.



Dropout is a concern in many trials because the sample size calculations assume an expected dropout rate, and higher dropout rates reduce the power of the primary analysis. This graph lets us track whether we are staying within the pre-specified dropout rates, to ensure we aren’t losing too much power to evaluate the primary endpoint.

study dropout tracking

Tracking enrollment overall and by site can help the study team manage the study and focus their efforts on sites that are lagging behind. Close monitoring of study dropouts is valuable so additional retention strategies can be put in place if needed before the number of dropouts has a detrimental effect on the power of a trial. 
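The dropout check described above comes down to comparing the observed dropout proportion with the rate assumed when the sample size was calculated. A minimal Python sketch follows; the function name, the counts, and the 15% assumed rate are hypothetical values chosen for illustration.

```python
def dropout_status(randomized, dropped, assumed_rate):
    """Compare the observed dropout proportion with the rate assumed at design."""
    if randomized == 0:
        # Nothing to evaluate before the first subject is randomized.
        return {"observed": 0.0, "within_assumption": True}
    observed = dropped / randomized
    return {"observed": observed, "within_assumption": observed <= assumed_rate}


# Illustrative numbers only: 10 dropouts among 80 randomized subjects.
status = dropout_status(randomized=80, dropped=10, assumed_rate=0.15)
```

Running the same check by site or by study month would show where retention efforts are most needed.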

Download: 5 Tips for Conducting Feasibility for a New Clinical Trial

Embracing Open Source as Good Science

Posted by Brook White on Wed, Nov 30, 2016 @ 09:37 AM

Ryan Bailey, MA is a Senior Clinical Researcher at Rho. He has over 10 years of experience conducting multicenter asthma research studies, including the Inner City Asthma Consortium (ICAC) and the Community Healthcare for Asthma Management and Prevention of Symptoms (CHAMPS) project. Ryan also coordinates Rho’s Center for Applied Data Visualization, which develops novel data visualizations and statistical graphics for use in clinical trials.

Sharing. It's one of the earliest lessons your parents try to teach you - don't hoard, take turns, be generous. Sharing is a great lesson for life. Sharing is also a driving force behind scientific progress and software development. Science and software rely on communal principles of transparency, knowledge exchange, reproducibility, and mutual benefit.

The practice of open sharing or open sourcing has advanced these fields in several ways:

We also feel strongly that the impetus for open sharing is reflected in Rho's core values - especially team culture, innovation, integrity, and quality. Given our values, and given our role in conducting science and creating software, we've been exploring ways that we can be more active in the so-called "sharing economy" when it comes to our work.

One of the ways we have been fulfilling this goal is to release our statistical and data visualization tools as freely-accessible, open source libraries on GitHub. GitHub is one of the world's largest open source platforms for virtual collaboration and code sharing. GitHub allows users to actively work on their code online, from anywhere, with the opportunity to share and collaborate with other users. As a result, we not only share our code for public use, we also invite feedback, improvements, and expansions of our tools for other uses.

We released our first open source tool - the openFDA Adverse Event Explorer - in June 2015. Now we have 26 team members working on 28 public projects, and that number has been growing rapidly. The libraries and tools we've been sharing have a variety of uses: monitor safety data, track project metrics, visualize data, summarize every data variable for a project, aid with analysis, optimize SAS tools, and explore population data.

Most repositories include examples and wikis that describe the tools and how they can be used. An example of one of these tools, the Population Explorer, is shown below.

Interactive Population Explorer

interactive population explorer, clinical trial graphics

Access summary data on study population and subpopulations of interest in real time.

One of over 25 public projects on Rho's GitHub page - available at: https://github.com/RhoInc/PopulationExplorer

Over the next few months, we are going to highlight a few of our different open source tools here on the blog. We invite you to check back/subscribe to learn more about the tools we're making available to the public. We also encourage you to peruse the work for yourself on our GitHub page: https://github.com/RhoInc.

We are excited to be hosting public code and instructional wikis in a format that allows free access and virtual collaboration, and hope that an innovative platform like GitHub will give us a way to share our tools with the world and refine them with community feedback. As science and software increasingly embrace open source code, we are changing the way we develop tools and optimizing the way we do clinical research while staying true to our core purpose and values.

If you have any questions or want to learn more about one of our projects, email us at: graphics@rhoworld.com

Data Visualization: Conference Roundup

Posted by Brook White on Thu, Jun 11, 2015 @ 02:29 PM

The arrival of spring and summer gets us excited not just for warmer weather, but also conference season. This year, we're making an effort to share more about our data visualization tools at conferences throughout our industry. Rho's Center for Applied Data Visualization (ADV) is presenting some exciting work this year, and we want to share some of that with you.

In May, Shane Rosanbalm presented his Sankey Bar Charts at the PharmaSUG conference in Orlando, while fellow ADV member Ryan Bailey presented our Adverse Event Explorer at the Bio-IT World Conference in Boston. We've featured the Sankey Bar Charts and Adverse Event Explorer in previous blog posts, and you can learn more about these tools on our graphics-sharing website (Sankey Bar Chart | Adverse Event Explorer).

On June 4, Jeremy Wildfire presented at the Pharmaceutical Users Software Exchange (PhUSE) Single Day Event in Chapel Hill, NC. The theme of this event was "Visualizing Clinical Data" - a perfect topic for the ADV. Mr. Wildfire demonstrated several of the ADV's tools, including two new ones we recently released on our graphics-sharing website: an interactive Lab Results Explorer and an Immunologic Outcomes Explorer that was recently featured in a correspondence in the New England Journal of Medicine.

Immunologic Outcomes Explorer

Next up for the ADV is a presentation at the DIA Annual Meeting in Washington, DC on June 18th. In this presentation, Mr. Wildfire will demonstrate an interactive data explorer we created to interface with the FDA's openFDA project. openFDA is an open access portal and API designed to give the public easier access to the vast amount of data the FDA collects on medical devices and drugs. These data include over 3.8 million adverse event (AE) reports. Our interactive openFDA Adverse Event Explorer allows users to explore and compare AE data for all of the drugs available in the database in real time.
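For readers who want to query openFDA directly, here is a minimal sketch of building a request URL in Python. It assumes the public drug adverse event endpoint at https://api.fda.gov/drug/event.json and its `search`, `count`, and `limit` query parameters. The helper name `build_openfda_url` and the example drug are our own illustrations, and this is not the code behind our explorer.

```python
from urllib.parse import urlencode

# Assumed endpoint for openFDA drug adverse event reports.
OPENFDA_DRUG_EVENT_URL = "https://api.fda.gov/drug/event.json"


def build_openfda_url(drug_name, top_n=10):
    """Build a URL that asks for the most frequently reported reactions for a drug.

    build_openfda_url is a hypothetical helper; it only assembles the query
    string and does not contact the API.
    """
    params = {
        # Restrict to reports that mention this product name.
        "search": f'patient.drug.medicinalproduct:"{drug_name}"',
        # Tally reports by reaction term instead of returning full reports.
        "count": "patient.reaction.reactionmeddrapt.exact",
        # Keep only the top_n most frequent terms.
        "limit": top_n,
    }
    return f"{OPENFDA_DRUG_EVENT_URL}?{urlencode(params)}"


url = build_openfda_url("aspirin", top_n=5)
print(url)
```

Fetching the URL (in a browser or with any HTTP client) should return JSON listing reaction terms with their report counts, which is the kind of summary an AE explorer can chart.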

openFDA Explorer

You can access our openFDA tool on our website here: http://graphics.rhoworld.com/OpenFDAExplorer/

One of the best aspects of the conferences is getting to talk with our fellow researchers, investigators, and scientists about our data visualization tools - how our tools can aid in their work, ways the tools can be improved, and ideas for new tool development. These conversations help us refine our existing tools and they inspire our next set of projects.

If you would like to participate in these exciting conversations, we would love to hear from you. To request more information, see the tools in action, share your ideas, or make plans to chat with us at an upcoming conference, contact graphics@rhoworld.com.


Data Visualization: Find your Flow with Sankey Bar Charts

Posted by Brook White on Tue, Jun 02, 2015 @ 10:52 AM

Many clinical trials collect prospective categorical data from participants to chart changes in the study population over time. Common examples would be quality of life questionnaires or risk scales, which provide a quick, standardized assessment of participant outcomes at a given time point.

A popular method for reporting prospective categorical data is to show results in a stacked bar chart. Consider the stacked bar chart below which reports number of risk factors participants exhibited at each of a series of visits.

sankey bar chart

This stacked bar chart is useful for quickly identifying trends in the overall study population - in this case, we can observe an increase in risk factors reported over time - but it does not provide much information about subgroups in the study. In the era of personalized and precision medicine, subgroup analysis is increasingly important for identifying which groups of people are most likely (or least likely) to respond to a particular treatment.

In our example above, we can see that there is a sizable increase in participants reporting 3 risk factors (dark green bar) from the 30-month visit to the 60-month visit. Where did these high-risk factor participants come from? We might assume they came from the group who had previously reported 2 or more risk factors, but the bar graph alone does not answer this question.

One solution is to overlay a Sankey flow diagram onto the chart to shed some light on this mystery. Sankey diagrams were popularized by Matthew Henry Phineas Riall Sankey, a 19th-century Irish engineer, who created flow diagrams where the size of the arrow between two nodes is proportional to the magnitude of the flow.

With a Sankey Bar Chart, we can get the following visualization of our data:

sankey bar chart

Now we can see how our data flow between each time point, which helps us identify patterns in our data.  

Let's revisit our question from earlier. Where did the 29% of high-risk factor participants at 60 months come from? According to the diagram, some came from the groups reporting 2 and 3 risk factors at 12 months, but more than half came from the groups previously reporting 0 or 1 risk factor - not what we might have expected from just looking at the bar chart.
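The flows in the overlay are simply counts of participants who move from one category at an earlier visit to a category at a later visit. Here is a minimal sketch of that tabulation in Python. The records, field names, and helper functions are all made up for illustration; this is not the SAS code used to produce the chart, and the numbers do not match the figure.

```python
from collections import Counter

# Hypothetical participant-level data: number of risk factors at two visits.
records = [
    {"id": "P01", "visit_1": 0, "visit_2": 3},
    {"id": "P02", "visit_1": 1, "visit_2": 3},
    {"id": "P03", "visit_1": 1, "visit_2": 3},
    {"id": "P04", "visit_1": 2, "visit_2": 3},
    {"id": "P05", "visit_1": 3, "visit_2": 3},
    {"id": "P06", "visit_1": 0, "visit_2": 0},
    {"id": "P07", "visit_1": 1, "visit_2": 1},
    {"id": "P08", "visit_1": 2, "visit_2": 2},
]


def flow_counts(rows, start, end):
    """Count participants moving from each start category to each end category."""
    return Counter((row[start], row[end]) for row in rows)


def sources_of(flows, target):
    """Share of the target end category that came from each start category."""
    into_target = {src: n for (src, dst), n in flows.items() if dst == target}
    total = sum(into_target.values())
    if total == 0:
        return {}
    return {src: n / total for src, n in sorted(into_target.items())}


flows = flow_counts(records, "visit_1", "visit_2")
shares = sources_of(flows, target=3)
for src, share in shares.items():
    print(f"{share:.0%} of the 3-risk-factor group previously reported {src}")
```

Each (start, end) pair corresponds to one band in the diagram, and its count sets the band's width. In this toy data, more than half of the 3-risk-factor group comes from participants who previously reported 0 or 1 risk factor, which is the kind of pattern the bar chart alone would hide.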

For those wanting to really dive into their data, we can provide an interactive version allowing users to explore the chart by selecting individual bar sections or flows and isolating the data for those sections.

sankey bar chart

Like all good data visualizations, the Sankey bar chart is designed to communicate the story behind the data. The bar chart alone tells part of the story, but adding a Sankey overlay provides a richer and more detailed understanding of our data.

Rho's Center for Applied Data Visualization (ADV) specializes in bringing clinical data to life by making charts like these for use in both static and interactive formats. You can visit Rho's graphics-sharing website to learn more about the Sankey Bar Chart, play with an interactive version of the tool, read a paper on creating Sankey bar charts in SAS (presented by ADV member Shane Rosanbalm at the 2015 PharmaSUG conference), and see some of the other data visualizations the ADV has developed.

If you'd like more information about how Rho can create visualizations for your research project, contact us.

View "Visualizing Multivariate Data" Video

Introducing the Adverse Events Explorer

Posted by Brook White on Tue, Mar 17, 2015 @ 11:32 AM

In the conduct of clinical trials, few tasks are as important as monitoring and reporting adverse events (AEs). The standard method of reporting AEs is to compile detailed listings of every adverse event reported in a study. Medical monitors and regulatory bodies are then tasked with reviewing these listings to monitor patient safety and search for complications and side effects associated with an investigational product.

For studies with large participant enrollment, severe diseases, complex treatments, or long treatment timelines, thousands of AEs may be reported, leading to scores of pages of listings. While comprehensive reporting is necessary, the current approach of creating page after page of listings is inefficient. Worse, this reporting approach creates a risk that clinically relevant safety signals will be obscured by the sheer volume of events reported.

Members of Rho's Center for Applied Data Visualization (ADV) recognized an opportunity to improve upon this paradigm by creating an interactive web-based Adverse Event Explorer, which gives study monitors a more intuitive and powerful way to query AE data in real time.

adverse event explorer

The AE Explorer contains all of the information available in standard listings, but our tool adds simple graphics to aid with data comprehension and applies web-based interactivity to give users the ability to search their data in real time.

The default view of the Explorer is a single-screen display of AEs grouped by System Organ Class. Beside each row, a dot plot portrays the adverse event incidence for each treatment group, which gives users a simple graphical comparison for their data. An additional graphic can be displayed to indicate the size of the difference between groups and whether the difference is statistically significant. Since humans more readily process graphics than abstract characters like numbers and letters, the graph helps bring the "story" of the data to life.
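To make the numbers behind each dot concrete, here is a minimal sketch in Python of how incidence per treatment group and the between-group difference might be tabulated. It assumes incidence means the proportion of participants in an arm with at least one event in the class. The records, arm names, and function name are hypothetical, significance testing is left out, and this is not the Explorer's own code.

```python
# Hypothetical AE records: one row per participant per reported event.
ae_records = [
    {"id": "P01", "arm": "Treatment", "soc": "Nervous system disorders"},
    {"id": "P01", "arm": "Treatment", "soc": "Nervous system disorders"},  # repeat event
    {"id": "P02", "arm": "Treatment", "soc": "Gastrointestinal disorders"},
    {"id": "P03", "arm": "Treatment", "soc": "Nervous system disorders"},
    {"id": "P05", "arm": "Placebo", "soc": "Nervous system disorders"},
]

# Hypothetical number of participants in each arm (the denominators).
arm_sizes = {"Treatment": 4, "Placebo": 4}


def incidence_by_soc(records, sizes):
    """Return {soc: {arm: proportion of participants with at least one event}}."""
    participants = {}  # (soc, arm) -> set of participant ids
    for rec in records:
        participants.setdefault((rec["soc"], rec["arm"]), set()).add(rec["id"])
    socs = sorted({soc for soc, _ in participants})
    return {
        soc: {
            arm: len(participants.get((soc, arm), set())) / n
            for arm, n in sizes.items()
        }
        for soc in socs
    }


rates = incidence_by_soc(ae_records, arm_sizes)
for soc, by_arm in rates.items():
    diff = by_arm["Treatment"] - by_arm["Placebo"]
    dots = ", ".join(f"{arm} {rate:.0%}" for arm, rate in by_arm.items())
    print(f"{soc}: {dots} (difference {diff:+.0%})")
```

Collecting participant ids in a set means a participant who reports the same kind of event twice is counted once, so each rate is a proportion of participants and not a count of events.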

These graphics provide an intuitive visual component to the traditional AE report, but the real strength of the AE Explorer is in the interactive features that let users query the data in real time.

At a single click, the System Organ Class rows can be expanded to show the nested Preferred Terms underneath - each row with its own data and dot plot. Users can hover their cursor over the graphic elements to show additional detail about the data points. For any given AE, users can drill down to see a participant-level summary of the underlying data.
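The expandable rows rest on a simple two-level grouping: System Organ Class on the outside, Preferred Term on the inside, with participant ids at the bottom for the drill-down. Here is a minimal sketch of that structure in Python, using made-up records and hypothetical function names; the Explorer itself may organize its data differently.

```python
from collections import defaultdict

# Hypothetical AE records carrying both coding levels.
ae_records = [
    {"id": "P01", "soc": "Nervous system disorders", "pt": "Headache"},
    {"id": "P02", "soc": "Nervous system disorders", "pt": "Headache"},
    {"id": "P02", "soc": "Nervous system disorders", "pt": "Dizziness"},
    {"id": "P03", "soc": "Gastrointestinal disorders", "pt": "Nausea"},
]


def nest_by_term(records):
    """Group participant ids as {soc: {pt: set of ids}}."""
    tree = defaultdict(lambda: defaultdict(set))
    for rec in records:
        tree[rec["soc"]][rec["pt"]].add(rec["id"])
    return tree


def soc_participants(tree, soc):
    """Participants with any event in the class, each counted once."""
    return set().union(*tree[soc].values())


tree = nest_by_term(ae_records)
soc = "Nervous system disorders"
print(soc, len(soc_participants(tree, soc)))
for pt in sorted(tree[soc]):
    # Expanded rows: one line per Preferred Term, with drill-down ids.
    print("  ", pt, sorted(tree[soc][pt]))
```

Note that the class-level count is not the sum of its term-level counts. Participant P02 reports both Headache and Dizziness but is counted once at the System Organ Class level.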

Events can be filtered by prevalence so that only AEs above a particular threshold (e.g., 5%) are displayed. Fully customizable filters can also be created to allow users to filter their data at a click. For instance, users can filter by severity to drop out mild and moderate adverse events to focus only on events classified as severe or life-threatening. A search bar also lets users instantly search the listings for terms of interest (e.g., "headache").
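The three filters described above (prevalence threshold, severity, and text search) can be sketched in a few lines of Python. The records, severity labels, enrollment figure, and function names below are hypothetical, and the order in which the filters are applied is our own assumption for illustration.

```python
# Hypothetical event-level records; field names and severity labels are illustrative.
events = [
    {"id": "P01", "term": "Headache", "severity": "Mild"},
    {"id": "P02", "term": "Headache", "severity": "Severe"},
    {"id": "P03", "term": "Nausea", "severity": "Moderate"},
    {"id": "P04", "term": "Anaphylaxis", "severity": "Life-threatening"},
]
N_PARTICIPANTS = 20  # hypothetical enrollment, used as the prevalence denominator


def prevalence(records, n):
    """Proportion of participants reporting each term at least once."""
    ids = {}
    for rec in records:
        ids.setdefault(rec["term"], set()).add(rec["id"])
    return {term: len(found) / n for term, found in ids.items()}


def filter_events(records, n, min_prevalence=0.0, severities=None, search=""):
    """Apply severity, search, then prevalence filters; defaults skip each one."""
    kept = records
    if severities is not None:
        kept = [r for r in kept if r["severity"] in severities]
    if search:
        kept = [r for r in kept if search.lower() in r["term"].lower()]
    # Prevalence is computed on the events that survive the earlier filters.
    rates = prevalence(kept, n)
    return [r for r in kept if rates[r["term"]] >= min_prevalence]


serious = filter_events(events, N_PARTICIPANTS,
                        severities={"Severe", "Life-threatening"})
common = filter_events(events, N_PARTICIPANTS, min_prevalence=0.08)
headaches = filter_events(events, N_PARTICIPANTS, search="head")
print([r["id"] for r in serious])
print([r["term"] for r in common])
print([r["id"] for r in headaches])
```

Because prevalence is recomputed after the severity and search filters, the threshold applies to what is currently displayed. A tool could just as reasonably apply the threshold to overall prevalence first; that is a design choice.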

End users consistently report positive experiences when using the Explorer, noting that the tool saves time and gives them improved understanding of the data. We anticipate that the clinical trials industry will increasingly incorporate these types of powerful analytic tools into routine clinical trial management and reporting. In fact, a study-specific instance of our AE Explorer was recently used to report AE data for an article published in the New England Journal of Medicine.

Our AE Explorer improves the reporting and monitoring of AEs by applying data visualizations and contemporary web browsing practices to the traditional process. You can learn more about the AE Explorer and try it out for yourself on Rho's public graphics-sharing website: graphics.rhoworld.com.