<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>RexDB</title>
	<atom:link href="http://www.rexdb.org/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.rexdb.org</link>
	<description>The open-source research exchange platform</description>
	<lastBuildDate>Tue, 07 May 2013 14:30:27 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.4.2</generator>
		<item>
		<title>Introduction to the data management lifecycle</title>
		<link>http://www.rexdb.org/introduction-to-the-data-management-lifecycle/</link>
		<comments>http://www.rexdb.org/introduction-to-the-data-management-lifecycle/#comments</comments>
		<pubDate>Fri, 15 Mar 2013 20:01:37 +0000</pubDate>
		<dc:creator>Frank</dc:creator>
				<category><![CDATA[Data Management for Scientific Research]]></category>

		<guid isPermaLink="false">http://www.rexdb.org/?p=481</guid>
		<description><![CDATA[&#160; In my introductory blog post, I mentioned that this series would approach scientific data management from both bottom-up and top-down perspectives. The last two posts introducing the small lab and research center described data management issues specific to those&#8230;<p class="more-link-p"><a class="more-link" href="http://www.rexdb.org/introduction-to-the-data-management-lifecycle/">Read more &#8594;</a></p>]]></description>
			<content:encoded><![CDATA[<p>&nbsp;</p>
<p>In my <a href="http://www.rexdb.org/good-data-management-as-good-science/" target="_blank">introductory blog post</a>, I mentioned that this series would approach scientific data management from both bottom-up and top-down perspectives. The last two posts introducing the <a href="http://www.rexdb.org/introduction-to-case-study-1-the-small-lab/" target="_blank">small lab</a> and <a href="http://www.rexdb.org/introduction-to-case-study-2-the-research-center/" target="_blank">research center</a> described data management issues specific to those sites. In this post and the next, I’ll introduce important general concepts about scientific data management. This will give us a common language with which to analyze the case studies in greater detail.</p>
<p>We need a general conceptual model and language that help us describe what happens to research data once it is acquired. There are many lifecycle models to choose from. In this series, I&#8217;ll use a preliminary model we have been working with here at Prometheus that is general enough to encompass both of the case studies, but specific enough to provide context for meaningful analysis of data management practices. As I describe each stage of the lifecycle, you may find it helpful to think about what happens to your research data at that stage. For the sake of clarity, I’ve omitted certain steps, such as producing a data management plan and archiving data, that are important but are not formally part of the cycle I will describe. They will be the topics of later posts.</p>
<p><strong>The Data Management Lifecycle</strong></p>
<p>Data passes through four primary stages: acquisition, curation, use, and repurposing. These stages are connected by data management tasks: quality assurance (QA), exploration, transformation, and expansion (see figure below).<br />
<a href="https://lh3.googleusercontent.com/2KuL-5xGusN03YAkI6nYU5RrDZfexA1dyOTLwQ2FoMfAa2daDAFECjxqhvTDG3mZZWF0YcuqUMiPQ6Cq-V4Cv3AmYoaR69yez4o7Leg4CkuT2DC6-4QQm64T"><img src="https://lh3.googleusercontent.com/2KuL-5xGusN03YAkI6nYU5RrDZfexA1dyOTLwQ2FoMfAa2daDAFECjxqhvTDG3mZZWF0YcuqUMiPQ6Cq-V4Cv3AmYoaR69yez4o7Leg4CkuT2DC6-4QQm64T" alt="" width="498px;" height="309px;" /></a></p>
<p>Data first enters the lifecycle by being acquired. Researchers typically acquire data from multiple sources, which must be centralized and stored to enable QA testing. QA testing helps you ensure that (1) you’ve collected the right data, (2) it isn’t corrupted, and (3) the structure is what you expected it to be.</p>
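<p>To make those three checks concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the column names, participant IDs, and the idea of comparing against a checksum recorded at the source are all hypothetical, not a description of how any particular lab or tool does QA.</p>

```python
import csv
import hashlib
import io

# Hypothetical expectations for one incoming file; every name here is illustrative.
EXPECTED_COLUMNS = ["participant_id", "visit_date", "score"]
EXPECTED_IDS = {"P001", "P002", "P003"}


def sha256_of(text):
    """Return the SHA-256 hex digest of the file contents."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def qa_report(text, expected_digest):
    """Run the three QA checks and return a dict of pass/fail flags."""
    reader = csv.DictReader(io.StringIO(text))
    rows = list(reader)
    header = list(reader.fieldnames or [])
    ids = {row["participant_id"] for row in rows} if "participant_id" in header else set()
    return {
        # (1) Right data: the participants we expected are the ones we received.
        "right_data": ids == EXPECTED_IDS,
        # (2) Not corrupted: contents match the digest recorded at the source.
        "not_corrupted": sha256_of(text) == expected_digest,
        # (3) Expected structure: the columns match, in order.
        "expected_structure": header == EXPECTED_COLUMNS,
    }


sample = (
    "participant_id,visit_date,score\n"
    "P001,2013-03-01,12\n"
    "P002,2013-03-02,9\n"
    "P003,2013-03-04,15\n"
)
report = qa_report(sample, sha256_of(sample))
print(report)
```

<p>In practice the expected digest would come from wherever the file originated rather than being computed from the received copy, and the list of expected columns and participants would come from your study documentation.</p>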
<p>Once QA is complete, the data is ready to be curated &#8212; that is, organized, cleaned, and monitored for its intended use. Curation is one of the most important steps in the data lifecycle because it improves data quality and makes the data easier to analyze and reuse.</p>
<p>When the data is clean, it’s usually first explored.  Exploration can involve many activities, such as getting familiar with the columns and rows in a data set and seeking to understand the relationships among variables. At some point, exploration crosses over into actual data use, such as conducting statistical analyses, creating reports, and sharing the data with others. Data may spend a long time in this stage.</p>
<p>Data sometimes has a life beyond the project that generated it. It might be used again for the same purpose (data reuse), or it might be transformed for an entirely new purpose (data repurposing). Google is a prime example of data repurposing. When you conduct a Google search, Google doesn’t only use your search string and general location to provide search results; it also uses that data to deliver customized advertising and even to <a href="http://www.google.org/flutrends/us/#US" target="_blank">model flu trends</a>.</p>
<p>Not all data gets repurposed, of course. Sometimes, as data is explored or analyzed, the analyst realizes that more data is needed to answer the original question; or, the results prompt a new question. In either case, one must have access to the data to answer the respecified question. The cycle starts again when this data is acquired or accessed.</p>
<p>Now that we have covered the basic data management lifecycle, we are almost ready to analyze different data management scenarios. In the next post, I&#8217;ll introduce Prometheus Research&#8217;s Integrated Data Management model. By the end of that post, we&#8217;ll have covered the analytical tools needed to critique the data management challenges that the stakeholders in each of our case studies are facing.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.rexdb.org/introduction-to-the-data-management-lifecycle/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Driving open-source innovation requires staying in tune with emerging research trends</title>
		<link>http://www.rexdb.org/driving-open-source-innovation-requires-staying-in-tune-with-emerging-research-trends/</link>
		<comments>http://www.rexdb.org/driving-open-source-innovation-requires-staying-in-tune-with-emerging-research-trends/#comments</comments>
		<pubDate>Thu, 07 Mar 2013 14:06:28 +0000</pubDate>
		<dc:creator>nara</dc:creator>
				<category><![CDATA[Data Management for Scientific Research]]></category>
		<category><![CDATA[Software Development]]></category>

		<guid isPermaLink="false">http://www.rexdb.org/?p=471</guid>
		<description><![CDATA[During our weekly lunch seminar, Staff Scientist Frank Farach presented on the four major research trends that impact data management and consequently our decisions about OS RexDB. 1) Decentralization of electronic data capture (EDC). Researchers are moving away from costly&#8230;<p class="more-link-p"><a class="more-link" href="http://www.rexdb.org/driving-open-source-innovation-requires-staying-in-tune-with-emerging-research-trends/">Read more &#8594;</a></p>]]></description>
			<content:encoded><![CDATA[<p dir="ltr">During our weekly lunch seminar, Staff Scientist Frank Farach presented on the four major research trends that impact data management and consequently our decisions about <a href="http://www.rexdb.org" target="_blank">OS RexDB</a>.<strong></strong></p>
<p dir="ltr"><strong>1) Decentralization of electronic data capture (EDC)</strong>. Researchers are moving away from costly in-house study visits for EDC. Instead, for some types of research, they are equipping participants with convenient and budget-friendly ways to complete study requirements through mobile devices like phones and tablets. However, this trend is not without implementation challenges. Our OS RexDB team will need to carefully evaluate the security and compatibility issues that accompany remote EDC.</p>
<p dir="ltr"><strong>2) Multidimensional data integration</strong>. The days of siloed, single-lab or single-data-type research are long gone. Research data are now collected across multiple sites, studies, and data types (e.g., biospecimens, phenotype, genotype, images), as well as across different time points and subsets of people. Given this variable complexity, creating one-off Access databases or merging hundreds of Excel spreadsheets to create a few data sets is not going to cut it. Luckily, Prometheus is ahead of the curve in this area since we have been solving data integration problems for over a decade.</p>
<p><strong>3) Reproducible research and open science</strong>. This trend encourages transparency in research methods and analyses to facilitate the replication of research findings. The <a href="http://openscienceframework.org/" target="_blank">Open Science Framework</a>, a web application developed by the <a href="http://centerforopenscience.org/" target="_blank">Center for Open Science</a>, is a useful tool for making your study open-science ready and reproducible. It supports collaborative, scientific workflows and can help researchers and their collaborators better document their study design, materials, and data plans. Even better, users can control what information they want to make public or keep private.</p>
<p><a href="http://www.runmycode.org/CompanionSite/" target="_blank">RunMyCode</a> is another useful way to document study analyses for a published research article. Researchers can create a companion site for a published paper by adding their data and calculations at runmycode.org. Other users can access the information and run the code through the website, making it easy to inspect and verify the findings published in the original article.</p>
<p>OS RexDB technology fits nicely with the Open Science trend because the system will have an audit trail, will make it easy to document metadata, and will enable quick data exports.</p>
<p><strong>4) Data interoperability and standards</strong>.  With the wide implementation of electronic medical records (EMR) systems, researchers need to be able to &#8220;plug and play&#8221; easily across data systems (e.g., data exchanges between an EMR and a clinical research database). One of the main challenges with data interoperability is the lack of a standard information model. One organization that has spearheaded the standardization effort for protocol-driven clinical research is the Clinical Data Interchange Standards Consortium <a href="http://www.cdisc.org/mission-and-principles" target="_blank">(CDISC)</a>. CDISC provides a common framework for defining the structure, format, and exchange of clinical research data throughout the data lifecycle. The OS RexDB team will need to evaluate the challenges of enabling native support for these standards before RexDB can facilitate data interoperability with them.</p>
<p>Prometheus aims to be deliberate and thoughtful with regard to the features we support in OS RexDB.  Staying abreast of the latest research trends is just one way in which we can identify real-time data management challenges and debate creative ways in which to solve them.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.rexdb.org/driving-open-source-innovation-requires-staying-in-tune-with-emerging-research-trends/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Brief HTRAF introduction video</title>
		<link>http://www.rexdb.org/brief-htraf-introduction-video/</link>
		<comments>http://www.rexdb.org/brief-htraf-introduction-video/#comments</comments>
		<pubDate>Mon, 18 Feb 2013 15:26:33 +0000</pubDate>
		<dc:creator>nara</dc:creator>
				<category><![CDATA[Software Development]]></category>

		<guid isPermaLink="false">http://www.rexdb.org/?p=448</guid>
		<description><![CDATA[&#160; The HyperText Rapid Application Framework (HTRAF) is an efficient way to embed data into HTML pages. HTRAF is unique because it works with HTSQL to fetch data from a database and display it directly on a page. Check out&#8230;<p class="more-link-p"><a class="more-link" href="http://www.rexdb.org/brief-htraf-introduction-video/">Read more &#8594;</a></p>]]></description>
			<content:encoded><![CDATA[<p>&nbsp;</p>
<p>The HyperText Rapid Application Framework (<a href="http://htsql.org/htraf/index.html" target="_blank">HTRAF</a>) is an efficient way to embed data into HTML pages. HTRAF is unique because it works with <a href="http://www.rexdb.org/htsql-primer-for-business-analysts/" target="_blank">HTSQL</a> to fetch data from a database and display it directly on a page. Check out the introduction video below to learn more about how HTRAF enables speedy screen development.</p>
<p>&nbsp;</p>
<p><iframe src="http://player.vimeo.com/video/58994779" frameborder="0" width="500" height="281"></iframe></p>
<p><a href="http://vimeo.com/58994779">HTRAF demo 2013-02-05</a> from <a href="http://vimeo.com/prometheusresearch">Prometheus Research, LLC</a> on <a href="http://vimeo.com">Vimeo</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.rexdb.org/brief-htraf-introduction-video/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>RexDB is on Twitter!</title>
		<link>http://www.rexdb.org/rexdb-is-on-twitter/</link>
		<comments>http://www.rexdb.org/rexdb-is-on-twitter/#comments</comments>
		<pubDate>Thu, 07 Feb 2013 13:47:25 +0000</pubDate>
		<dc:creator>charles</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.rexdb.org/?p=426</guid>
		<description><![CDATA[The Open Source RexDB Project is now on Twitter!  Follow us on Twitter and see the latest news, demos and code launches for our Integrated Data Management project. Tweets by @rexdb]]></description>
			<content:encoded><![CDATA[<p>The Open Source RexDB Project is now on <a href="https://twitter.com/rexdb">Twitter</a>!  Follow us on Twitter and see the latest news, demos and code launches for our Integrated Data Management project.</p>
<p><a class="twitter-timeline" href="https://twitter.com/rexdb" data-widget-id="299510084537548803">Tweets by @rexdb</a><br />
<script type="text/javascript">// <![CDATA[
!function(d,s,id){var js,fjs=d.getElementsByTagName(s)[0];if(!d.getElementById(id)){js=d.createElement(s);js.id=id;js.src="//platform.twitter.com/widgets.js";fjs.parentNode.insertBefore(js,fjs);}}(document,"script","twitter-wjs");
// ]]&gt;</script></p>
]]></content:encoded>
			<wfw:commentRss>http://www.rexdb.org/rexdb-is-on-twitter/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Prometheus Research awarded $700,000 NIH SBIR grant to improve open source tools for Autism research</title>
		<link>http://www.rexdb.org/prometheus-research-awarded-700000-nih-sbir-grant-to-improve-open-source-tools-for-autism-research/</link>
		<comments>http://www.rexdb.org/prometheus-research-awarded-700000-nih-sbir-grant-to-improve-open-source-tools-for-autism-research/#comments</comments>
		<pubDate>Mon, 04 Feb 2013 16:56:54 +0000</pubDate>
		<dc:creator>Julie H.</dc:creator>
				<category><![CDATA[Awards]]></category>

		<guid isPermaLink="false">http://www.rexdb.org/?p=398</guid>
		<description><![CDATA[&#160; Prometheus announced today that it has been awarded $700,000 by the National Institutes of Health (NIH) Small Business Innovation Research (SBIR) program to extend its Open Source Research Exchange Database (RexDB) for the management of autism spectrum disorders research. The&#8230;<p class="more-link-p"><a class="more-link" href="http://www.rexdb.org/prometheus-research-awarded-700000-nih-sbir-grant-to-improve-open-source-tools-for-autism-research/">Read more &#8594;</a></p>]]></description>
			<content:encoded><![CDATA[<p>&nbsp;</p>
<p>Prometheus announced today that it has been awarded $700,000 by the National Institutes of Health (NIH) Small Business Innovation Research (SBIR) program to extend its Open Source Research Exchange Database (RexDB) for the management of autism spectrum disorders research. The overarching aim of the project is to empower autism investigators to make more effective use of data locally and more efficiently exchange data with the scientific community.  “Existing technologies do not support the need to maintain integrated data management systems in dynamic research environments,” explains Dr. Leon Rozenblit, CEO of Prometheus Research and Principal Investigator on the project.  “Our existing platform, RexDB, is close, but we know from experience that there are a number of essential features still to be developed to close this gap.”</p>
<p>&nbsp;</p>
<p>The Research Exchange Database was initially developed to meet the needs of the Simons Foundation Autism Research Initiative (SFARI) for use on their pioneering multi-site autism study, the Simons Simplex Collection (SSC).  In that study, more than 12 universities collected and analyzed data from over 2,700 families with at least one child affected by autism.  The resulting genetic and phenotypic repository, known as SFARI Base, is underpinned by RexDB technology.  Since the SSC, RexDB has been adopted by leading academic research centers across the US to overcome the unique challenges of mental and behavioral health research.</p>
<p>&nbsp;<br />
Partners who have agreed to collaborate with Prometheus on the grant include the Yale University Child Study Center, the Marcus Autism Center, Weill Cornell Medical College, and the University of Missouri Thompson Center for Autism and Neurodevelopmental Disorders.  The NIH SBIR grant will support two years’ worth of research and development until the end of 2014.</p>
<p>&nbsp;<br />
“This software will allow researchers to make better use of scarce research dollars, and will help accelerate progress in understanding autism and other mental disorders,” Dr. Rozenblit added.  “We’re thrilled that the NIH has chosen to support the next implementation of RexDB.”</p>
<p>&nbsp;</p>
<p><em>This research is supported by the National Institute of Mental Health of the National Institutes of Health under Award Number R43MH099826. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.</em></p>
<p>&nbsp;</p>
]]></content:encoded>
			<wfw:commentRss>http://www.rexdb.org/prometheus-research-awarded-700000-nih-sbir-grant-to-improve-open-source-tools-for-autism-research/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>HTSQL Primer for Business Analysts</title>
		<link>http://www.rexdb.org/htsql-primer-for-business-analysts/</link>
		<comments>http://www.rexdb.org/htsql-primer-for-business-analysts/#comments</comments>
		<pubDate>Fri, 01 Feb 2013 17:09:57 +0000</pubDate>
		<dc:creator>charles</dc:creator>
				<category><![CDATA[Data Model]]></category>
		<category><![CDATA[Software Development]]></category>

		<guid isPermaLink="false">http://www.rexdb.org/?p=417</guid>
		<description><![CDATA[RexDB uses a fantastic open-source technology called HTSQL as the basis of its technology platform.  Hyper Text Structured Query Language (HTSQL) is a URI-to-SQL query language that takes a request over HTTP, converts it to a SQL query, executes the&#8230;<p class="more-link-p"><a class="more-link" href="http://www.rexdb.org/htsql-primer-for-business-analysts/">Read more &#8594;</a></p>]]></description>
			<content:encoded><![CDATA[<p>RexDB uses a fantastic open-source technology called HTSQL as the basis of its technology platform.  Hyper Text Structured Query Language (HTSQL) is a URI-to-SQL query language that takes a request over HTTP, converts it to a SQL query, executes the query against a database, and returns the results in a format best suited for the user agent (CSV, HTML, etc.).  HTSQL is designed for data analysts and other <em>accidental programmers</em> who have complex business inquiries to solve and need a productive tool to write and share database queries.  RexDB&#8217;s applications and data driven screens use HTSQL as a powerful and simple data access layer.</p>
<p>&nbsp;</p>
<p>In this video, Charles gives a brief primer on how to use HTSQL directly to access your data.  The talk is intended for business analysts or researchers, not DBAs or software developers.  The true strength of HTSQL is that in ten minutes you could be querying your data with ease!</p>
<p><iframe src="http://player.vimeo.com/video/59115494" frameborder="0" width="500" height="410"></iframe></p>
<p><a href="http://vimeo.com/59115494">HTSQL Primer for Business Analysts</a> from <a href="http://vimeo.com/prometheusresearch">Prometheus Research, LLC</a> on <a href="http://vimeo.com">Vimeo</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.rexdb.org/htsql-primer-for-business-analysts/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Introduction to Case Study #2: The Research Center</title>
		<link>http://www.rexdb.org/introduction-to-case-study-2-the-research-center/</link>
		<comments>http://www.rexdb.org/introduction-to-case-study-2-the-research-center/#comments</comments>
		<pubDate>Fri, 30 Nov 2012 15:26:29 +0000</pubDate>
		<dc:creator>Frank</dc:creator>
				<category><![CDATA[Data Management for Scientific Research]]></category>

		<guid isPermaLink="false">http://www.rexdb.org/?p=381</guid>
		<description><![CDATA[&#160; In my last post, we learned about our first case study: a small fictional lab that, like many, has formidable data management challenges. In this post, we’ll get acquainted with the challenges confronting a fictional research center, examining its&#8230;<p class="more-link-p"><a class="more-link" href="http://www.rexdb.org/introduction-to-case-study-2-the-research-center/">Read more &#8594;</a></p>]]></description>
			<content:encoded><![CDATA[<p>&nbsp;</p>
<p>In my <a href="http://www.rexdb.org/introduction-to-case-study-1-the-small-lab/" target="_blank">last post</a>, we learned about our first case study: a small fictional lab that, like many, has formidable data management challenges. In this post, we’ll get acquainted with the challenges confronting a fictional research center, examining its research, personnel, data management goals, and tools. But since we’re dealing with such a large operation, let’s first take a look at how it is structured, financed, and governed.</p>
<p><strong>Organization and Governance</strong><br />
The Eating Disorders Research Center (EDRC) was founded 5 years ago with funds from a private, nonprofit eating disorders foundation. Its mission is to reduce the prevalence of eating disorders in the U.S. by advancing translational research and specialty care. The Center occupies an entire floor of a building in a teaching hospital and is affiliated with both the hospital and the psychiatry department of a large medical school. Its operating costs are paid for by a mixture of private donors, the hospital, and indirect costs from the federal research grants its PIs are awarded. Federal research grants and private donations fund its research.</p>
<p>The Board of Trustees, composed of prominent leaders from nonprofits, business, research, and the public, helps set the strategic direction of the center (see figure). The Center’s chief administrator is the Executive Director, who reports to the Board. Three other directors report to the Executive Director. Their titles reflect the primary administrative divisions of the EDRC: Director of Clinical Services, who oversees all clinics and medical staff; Director of Research, who oversees all research activities, PIs, and labs; and an Administrative Director who coordinates all in-house administrative departments and resources. The Center’s organizational ties to the hospital and medical school mean that they can utilize these institutions’ Information Technology (IT), electronic medical records (EMR), medical labs, and neuroimaging core, among other areas.</p>
<p>&nbsp;<br />
<img src="https://lh5.googleusercontent.com/vk7lrPB_BSkPydzRWkx0UH6Ra6ZnAtUDR_TVvzwCYD3ZC4RTwVb8HovE-1x81M6jxiVZk6Q8NEKxfUuR9X47yMIbx0ZK13LkOykFBwBI_6KW5hicEyxC" alt="" width="640px;" height="473px;" /><br />
</p>
<p><strong>Research</strong><br />
According to the EDRC’s charter, its research portfolio spans the translational science continuum of basic, translational, and clinical research. In practice, research groups from these areas collaborate extensively with each other and with groups outside the Center to pursue complex, interdisciplinary research, such as large-scale observational studies or clinical trials. As such, the research involves many different data types, tools, and data-management practices. The Center has been around long enough now that it has completed several of these large-scale studies and is in the middle of several more. At any one time, the PIs and their colleagues are writing several papers based on these completed studies while coordinating ongoing research. The need to centralize and streamline study recruitment is a pressing issue that is acknowledged at all levels of the organization. Such processes are currently siloed within research divisions.</p>
<p><strong>Personnel</strong><br />
The Research Division of the Center houses 10 principal investigators (PIs) who study different aspects of eating disorders. Each research division (basic, translational, and clinical) is led by one PI and has one data manager associated with it, who supports requests from the 2 to 4 research groups conducting research in that domain. In addition to the research arm, there is a Clinical Services Division composed of clinical providers, including physicians, psychologists, nurses, and health technicians. About half of the PIs in the Research Division provide care through the clinic.</p>
<p><strong>Tools</strong><br />
Lab personnel use many technologies to manage data throughout its lifecycle. They collect and manage data via a patchwork of paper forms, spreadsheets, cloud-based survey tools, and a locally based web and file server. Clinical providers use <a href="http://www.epic.com/" target="_blank">Epic</a>, the hospital’s EMR system; some of the translational and clinical research staff use it as well, but it is not integrated with their data management systems. Many of the research groups collect “big data” such as genomics, neuroimaging, and psychophysiological data using a diverse array of sensor hardware tied to equally diverse data processing pipelines.</p>
<p><strong>Data Management Goals</strong><br />
The Center shares the same broad goals as DARP, the small research lab: accuracy, efficiency, security, and shareability. Unlike DARP, however, the Center has a formal center-wide data management plan that specifies its general data policies and governance. Additional goals are to facilitate shared access to data, deal with data requests, integrate clinical and research data, and create patient registries.</p>
<p><strong>Discussion</strong><br />
As you might imagine, the EDRC faces formidable data management challenges. They must acquire, store, organize, clean, reuse, share, and archive their data while integrating across multiple data types, sites, and studies. Research-facing staff in the Center are acutely aware that they could be making much better progress toward this goal. Some of the research needs to securely interface with the EMR system. Because each lab might be running several research projects at the same time, they’re always advertising their studies and screening participants for eligibility, first by phone and then in person. The PIs are often busy writing papers and grants and traveling to conferences to present their latest research, which means they’re routinely requesting new data sets from the center’s data manager or research assistants. The lab that primarily conducts clinical trials has a host of federal regulations and guidelines it must follow with respect to data management and reporting.</p>
<p>Clearly, an organization like this has a lot of issues to work out, but it will be worth the effort: Because the center has such a high throughput of patients and heterogeneous data, any improvements they can make to their data management processes offer significant gains in research efficiency, data quality, staff morale, and return on investment. In other words, such improvements offer something for everyone in the organization.</p>
<p><strong>What’s Next?</strong><br />
Now that we’ve been introduced to both the small lab and research center, we’re ready to discuss the data management lifecycle. That will prepare us for subsequent posts in which I will critically examine how each organization handles lifecycle phases and transitions &#8212; and how they could be improved.</p>
<p>&nbsp;</p>
<p>&nbsp;</p>
]]></content:encoded>
			<wfw:commentRss>http://www.rexdb.org/introduction-to-case-study-2-the-research-center/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Introduction to case study # 1: The small lab</title>
		<link>http://www.rexdb.org/introduction-to-case-study-1-the-small-lab/</link>
		<comments>http://www.rexdb.org/introduction-to-case-study-1-the-small-lab/#comments</comments>
		<pubDate>Thu, 15 Nov 2012 21:42:31 +0000</pubDate>
		<dc:creator>Frank</dc:creator>
				<category><![CDATA[Data Management for Scientific Research]]></category>

		<guid isPermaLink="false">http://www.rexdb.org/?p=367</guid>
		<description><![CDATA[&#160; In my inaugural post on RexDB.org, I shared my plan to ground this series on data management for scientific research in common real-world scenarios. In this post, I’ll get this started by introducing a small lab, the first of&#8230;<p class="more-link-p"><a class="more-link" href="http://www.rexdb.org/introduction-to-case-study-1-the-small-lab/">Read more &#8594;</a></p>]]></description>
			<content:encoded><![CDATA[<p>&nbsp;</p>
<p>In my <a href="http://www.rexdb.org/good-data-management-as-good-science/" target="_blank">inaugural post</a> on RexDB.org, I shared my plan to ground this <a href="http://www.rexdb.org/category/data-management-for-scientific-research/" target="_blank">series</a> on data management for scientific research in common real-world scenarios. In this post, I’ll get this started by introducing a small lab, the first of two fictional research organizations that will serve as data management case studies to illustrate the pros and cons of different approaches to data management. Let’s take a brief tour of the small lab’s research, personnel, data management goals, and tools.<br />
</p>
<p>&nbsp;</p>
<p><strong>Research</strong><br />
As its name implies, the Depression and Anxiety Research Program (DARP) lab conducts  basic and interventional research (clinical trials) on mood and anxiety disorders. The lab has completed enrollment and data collection on its first major research project, a 5-year R01 study funded by the National Institutes of Health (NIH), and has just received notice that it has been awarded an R01 to continue the research of the first grant. Both studies were conducted jointly by DARP and a similar lab at another major research university. Although both sites collect data for the two studies, DARP is responsible for managing it. The first study was a clinical trial that evaluated the relative efficacy of two types of psychotherapy among people with co-occurring anxiety and depressive disorders. The new study will follow up on the people who participated in the clinical trial to see which therapy was associated with the best long-term outcomes. Both studies involve many types of data, including self-report, clinical interview, observational, psychophysiological, and genetic data. In addition to the old and new R01s, there is always a steady stream of smaller research studies run by graduate students and postdocs in the lab.</p>
<p>&nbsp;</p>
<p><strong>Personnel</strong><br />
DARP’s organizational chart is shown below. Gretchen, the Principal Investigator and Lab Director, is an Associate Professor of Psychology at<img class="alignright  wp-image-370" src="http://www.rexdb.org/wp-content/uploads/2012/11/Frank_s-Blog-Posts_-Meet-the-Lab-Google-Drive-1.png" alt="" width="394" height="270" /> a major research university. She started the lab 7 years ago and is coming up for tenure review at the end of the year. (Odds are good that she’ll get it.) Her postdoc, June, oversees the day-to-day management of data and works closely with the two RAs. When Gretchen or June needs a data set to address a particular set of research questions, June is usually the one who puts it together within SPSS. Shawn, the undergraduate RA, manages data related to the administration of the study, such as phone screen, recruitment, enrollment, medication, and randomization logs. He does the bulk of the data entry and makes sure everyone else gives him complete data in a timely manner. Elaine is a second-year graduate student who is paid to work half-time as a graduate RA. Her primary responsibility is to make sure the data collected on the old R01 are clean. Both Elaine and Shawn, as well as the two remaining graduate students, Sam and Nancy, are involved in the massive data-cleaning effort and in doing phone screens to recruit subjects.</p>
<p>&nbsp;</p>
<p><strong>Tools</strong><br />
DARP is transitioning from the set of tools they used to manage data for the old R01 to those they plan to use for the new R01. For the old study, they collected most research data on paper, which Shawn then entered directly into SPSS files stored on a local file server. He has also been managing all of the study management data through a constellation of Excel spreadsheets whose idiosyncrasies are known only to him. On the new study, they plan to use electronic data capture (EDC) methods wherever possible. They are currently evaluating several web-based solutions that we’ll discuss at a later point.</p>
<p>&nbsp;</p>
<p><strong>Data Management Goals</strong><br />
Gretchen’s primary concerns are the accuracy, efficiency, security, and shareability of her data: accuracy, because she values scientific validity and knows that errors can lead to retractions; efficiency, because she doesn’t want data issues to get in the way of doing research or writing papers; security, because her lab collects sensitive psychiatric data that must remain private; and shareability, because she believes that data should be shared with other colleagues or research repositories in the interest of advancing science. The lab has done a good job keeping the data secure, but they have struggled with accuracy, efficiency, and shareability.</p>
<p>&nbsp;</p>
<p><strong>Discussion</strong><br />
DARP is a small lab involved in large-scale, complex research. Although their data management goals are clear and well-intentioned, they have much room for improvement in their approach to achieving them (and they know it). Fortunately, they are in a good position to reevaluate their approach as they get the new R01 started and prepare to write their first papers based on the old R01 data. The lab could improve their data management situation considerably by (1) centralizing the data from their recent clinical trial and forthcoming longitudinal study in an open-source relational database; (2) implementing data-cleaning processes that can be applied automatically to incoming data; and (3) writing and sharing a lab-wide data management plan. I’ll elaborate on these and other recommendations in future posts, as we help members of DARP find better ways to meet their data-management goals.</p>
<p>In the next post, I’ll introduce you to a large research center struggling to manage data across several labs, studies, and sites. From there, we’ll move on to an examination of different facets of data management at the lab and center.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.rexdb.org/introduction-to-case-study-1-the-small-lab/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Auto-generated screens in OS RexDB</title>
		<link>http://www.rexdb.org/auto-generated-screens-in-os-rexdb/</link>
		<comments>http://www.rexdb.org/auto-generated-screens-in-os-rexdb/#comments</comments>
		<pubDate>Thu, 25 Oct 2012 16:29:52 +0000</pubDate>
		<dc:creator>henry agnew</dc:creator>
				<category><![CDATA[Architecture]]></category>

		<guid isPermaLink="false">http://www.rexdb.org/?p=340</guid>
		<description><![CDATA[&#160; RexDB will use a custom framework to automatically generate screens for data display and editing. We put together this diagram showing how it works. Our auto-generated HTML and Javascript screens will depend on DBGUI (Database graphical user interface), which&#8230;<p class="more-link-p"><a class="more-link" href="http://www.rexdb.org/auto-generated-screens-in-os-rexdb/">Read more &#8594;</a></p>]]></description>
			<content:encoded><![CDATA[<p>&nbsp;</p>
<p>RexDB will use a custom framework to automatically generate screens for data display and editing. We put together this diagram showing how it works.<br />
Our auto-generated HTML and JavaScript screens will depend on DBGUI (database graphical user interface), which contains a library of standard UI elements. DBGUI is the intermediary between the screen and the HTSQL catalog, which is generated from the database schema and allows the screens to rapidly execute queries.<br />
DBGUI will also rely on a set of standard screen templates. Each screen&#8217;s layout and functionality will be determined by a template configuration file, which can be easily updated by a designer or analyst at Prometheus. These configuration files may also contain custom HTML and JavaScript screen elements.</p>
<p>&nbsp;</p>
<p>&nbsp;</p>
<p><a href="http://www.rexdb.org/wp-content/uploads/2012/10/Untitled-document-Google-Drive-1.png"><img class="alignnone  wp-image-342" style="border: 0px none;" src="http://www.rexdb.org/wp-content/uploads/2012/10/Untitled-document-Google-Drive-1.png" alt="" width="838" height="661" /></a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.rexdb.org/auto-generated-screens-in-os-rexdb/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Good Data Management as Good Science</title>
		<link>http://www.rexdb.org/good-data-management-as-good-science/</link>
		<comments>http://www.rexdb.org/good-data-management-as-good-science/#comments</comments>
		<pubDate>Wed, 24 Oct 2012 15:28:26 +0000</pubDate>
		<dc:creator>Frank</dc:creator>
				<category><![CDATA[Data Management for Scientific Research]]></category>

		<guid isPermaLink="false">http://www.rexdb.org/?p=322</guid>
		<description><![CDATA[&#160; My name is Frank Farach, and I recently joined Prometheus Research as a Staff Scientist. I’m thrilled to be blogging here about scientific data management. In my 12-year career as a clinical psychology researcher and statistical consultant at several&#8230;<p class="more-link-p"><a class="more-link" href="http://www.rexdb.org/good-data-management-as-good-science/">Read more &#8594;</a></p>]]></description>
			<content:encoded><![CDATA[<p>&nbsp;</p>
<p>My name is Frank Farach, and I recently joined Prometheus Research as a Staff Scientist. I’m thrilled to be blogging here about scientific data management. In my 12-year career as a clinical psychology researcher and statistical consultant at several top research institutions, I’ve learned that data management practices can profoundly affect the quality, efficiency, and utility of research. In fact, this realization played an important role in my joining Prometheus Research, an organization of talented folks who share my passion for improving research through better data management practices and tools, such as open-source RexDB.</p>
<p>&nbsp;</p>
<p>Data management may be one of the least liked and most overlooked aspects of research, but it is integral to good science. And just like good science, it isn’t easy to do. Part of the problem is the big gap between the recommendations of best-practice guidelines for data management and the actual management of data. The “how” does not always accompany the “what.” On top of this, guidelines often focus on each stage of data management in isolation, which can miss systemic problems that emerge across multiple phases of data management. Also missing is a discussion of human and organizational factors in data management. (Hey, I’m a psychologist, what do you expect?)</p>
<p>&nbsp;</p>
<p>In future posts, I plan to directly address some of these gaps between data management theory and practice. Rather than adopt a strictly top-down approach to data management guidance, I’ll first consider the needs of investigators and their stakeholders under common data management scenarios, paying particular attention to how data is actually being managed, warts and all. Once we understand the unique goals and constraints operating in each scenario, we’ll evaluate the pros and cons of different data management practices and discuss how open-source RexDB can be used to adopt the best approaches.</p>
<p>&nbsp;</p>
<p>In the next post, we’ll dive right in by learning about the data-management goals of a typical research group struggling to manage data across several research studies.</p>
<p>&nbsp;</p>
]]></content:encoded>
			<wfw:commentRss>http://www.rexdb.org/good-data-management-as-good-science/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
