A few associates and I are starting an EMR (Electronic Medical Records) project. I have heard talk in the past, and more so lately, about a standard record format to facilitate transferring records, when appropriate (HIPAA), from one facility to another. Has anyone seen any information on this?
You can look to HL7 for interoperability between systems (http://www.hl7.org/). Patient demographic information and textual notes can be passed. I've been out of the EMR space too long to know if any standards groups have done anything interesting of late. A standard format that maintains semantic meaning is a really, really difficult problem. See SNOMED (http://www.nlm.nih.gov/research/umls/Snomed/snomed_main.html) for one long-running ontology effort -- and even that is barely the start of a rich interchange format.
A word of warning from someone who spent several years with an upstart EMR vendor: this is a very hard business to be in. Sales cycles for large health systems can literally take years, and the amount of hand-holding required for smaller medical practices can quickly erode margins. Integration with existing practice management systems is non-standard, even if those vendors claim otherwise. Many more issues abound. I'm not sure it's a wise space for an unfunded start-up to enter.
I think it's an error to consider HL7 to be a standard in the sense you seem to mean. It is heavily customized and can be quite different from one customer to the next. It's one of those standards with too much flexibility.
I recommend you read the standard (which should take you a while), then try to find a community of developers working with the standard. Ask them for horror stories, and be prepared for what you'll hear.
A month late, but...
The standard to shoot for is definitely HL7. It is used in many fields, so it is highly customizable, but there is a well-defined standard for healthcare. Each message type (ACK, DSR, MCF), segment (PID, PV1, OBR, MSH, etc.), sequence, and event type (A08, A12, A36) has a specific meaning regardless of your system of choice.
We haven't had a problem interfacing MiSYS, Statlan, Oacis, Epic, MUSE, GE Centricity/Lastword and others, sending DICOM, ADT, and PACS information between the systems we have in use. Most of these systems will be set up with an interface engine to tweak messages where needed, so adding a way to filter HL7 messages as they come into your system, and as they go out to downstream systems, is a must.
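For a flavour of what such filtering looks like, here is a minimal sketch in Python; the sample message, facility names and allowed-event list are made up for illustration, and a real interface engine does a great deal more:

    # Route/filter HL7 v2 messages by trigger event (illustrative only).
    # The sample ADT message and the allowed-event set are assumptions, not from any real feed.
    SAMPLE_ADT = "\r".join([
        "MSH|^~\\&|SENDING_APP|SENDING_FAC|RECEIVING_APP|RECEIVING_FAC|20240101120000||ADT^A08|MSG00001|P|2.3",
        "PID|1||123456^^^HOSP^MR||DOE^JANE||19700101|F",
        "PV1|1|I|WARD1^101^A",
    ])

    ALLOWED_EVENTS = {"A01", "A08", "A12"}  # events this downstream system cares about

    def event_type(message):
        """Pull the trigger event (e.g. 'A08') out of MSH-9."""
        msh_fields = message.split("\r")[0].split("|")  # '|' is the HL7 field separator
        return msh_fields[8].split("^")[1]              # MSH-9 looks like 'ADT^A08'

    def should_forward(message):
        return event_type(message) in ALLOWED_EVENTS

    print(event_type(SAMPLE_ADT))      # -> A08
    print(should_forward(SAMPLE_ADT))  # -> True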
Even if a new "presidential standard" for interoperability were announced (and I would hazard a guess it would be HL7 anyway), I would build the system with HL7 messaging, as this is currently the industry-accepted standard.
While solving interoperability, you shouldn't care only about the interchange format; the local storage format should be standardized as well, to simplify the transformation to the interchange format and vice versa.
openEHR is a great format for storage. It is more expressive than HL7 v2, v3 and CDA, so it can be transformed easily into any of those. The specs are open and available here: http://openehr.org/programs/specification/releases/1.0.2
For the interchange format, any of HL7 v2, v3 and CDA are good. Also consider CCR and CCD.
http://www.aafp.org/practice-management/health-it/astm.html
If you want to go outside HL7 thinking and are looking for a comprehensive EMR or EHR with a specified record format, rather than a record-extract message interchange format, then have a look at openEHR, http://www.openehr.org/. The ISO 13606 extract standard is (almost) a subset of openEHR. You will also find open-source reference libraries and openEHR implementations of varying maturity available in Java, .NET, Ruby, Python, Groovy, etc.
Some organisations are also producing HL7 artifacts like CDA as output from openEHR-based EHR/EMR systems.
Have a look at the Continuity of Care Record; IIRC, that's what Google Health uses for input. It's not an HL7-family standard (there is a competing HL7-family standard, but I don't recall what it's called off the top of my head).
There likely will not be a standard medical record format until the government dictates the format of one and requires its use by force of law.
That almost assuredly will not happen without socialized national health care, so in reality there is zero chance.
That's a correct answer, but I think something should be added about the meaningful use of EMRs.
Officials Announce ‘Meaningful Use,’ EHR Certification Criteria
Last week, CMS released proposed regulations defining the “meaningful use” of electronic health records, Reuters reports (Wutkowski/Heavey, Reuters, 12/31/09).
In addition, the Office of the National Coordinator for Health IT released an interim final rule describing the required certification standards for EHR technology (Simmons, HealthLeaders Media, 12/31/09).
Under the 2009 federal economic stimulus package, health care providers who demonstrate meaningful use of certified EHRs will qualify for incentive payments through Medicaid and Medicare.
Officials will offer a 60-day public comment period after both regulations are published in the Federal Register on Jan. 13. The interim final rule on EHR certification is scheduled to take effect 30 days after publication (Goedert, Health Data Management, 12/30/09). http://www.myemrstimulus.com/
This is a very hard problem, because data collection starts with an MD, and the only coding they know (ICD and CPT) is all about billing, not anything likely to be of use between providers (especially in a form where the MD can be held legally liable). And they hate even that much paperwork.
Add to that the fact that HIPAA dictates that the patient, not the provider, owns the data. Not that patients could understand it or do anything useful with it if they had it.
Good luck. Whatever happens will result from coercion by the government and will be a long, long time coming, IMHO.
Interestingly, the one source of solid medical info turns out to be the VA (because they don't have the adversarial issues of payment and legal liability). Go figure. That might be a good place to start for a standard, though, with some existing data and momentum behind it. Here's another question with some info.
We need to implement some new functionality for some clients. The functionality is essentially an EULA acceptance interface for the users. Users will open our app and be presented with the corresponding EULA (which varies from client to client). It needs to be able to store different versions of the EULA for the same client, and it also needs to store which users have accepted which version of the EULA. If a new version is saved, it will be presented to the users the next time they log in.
I've written a document suggesting we add two tables, EULAs and UserAcceptedEULA. That will allow us to store different EULAs and keep track of which ones each user has accepted, both current and previous versions.
My problem comes with how some people at the company want to do the implementation. They suggest using a ConstantGroups table (which contains ConstantGroupID, Timestamp, ClientID and Name) that we already use for grouping constants with their values, which are stored in another table; e.g., the ConstantGroup would be Quality, and the values would be High, Medium, Low.
To me this is a horrible, incredibly wrong way to do it. They're suggesting it because we already have an endpoint where you pass the ClientID and you get back a string, so it "does what we need".
I wrote the document explaining the whole solution, covering DB changes, APIs needed and UI modifications, but they still don't want to accept it because they think their way will save us time.
How do I make them understand how horribly wrong they are?
This depends somewhat on your assumptions about "good" design.
Many software folk have adopted the SOLID principles as being "good" (I am one of them). While the original thinking is about object-oriented design, I think many of them apply to databases too.
The first element of that is "Single responsibility". A table should do one thing, and one thing only. Your colleagues are trying to get a single entity to manage different concepts; the Constants table is suddenly responsible for both "constants" and "EULA acceptance".
That leads to "Open to extension, closed to change": if you need to change the implementation of "constants" or "EULAs", you first have to untangle it from the other. So any time you (might) save now will cost you later.
The second principle I like (especially in database design) is the Principle of Least Astonishment. Imagine a new developer joining the team and having to figure out how EULAs work. They would naturally look for some kind of "EULA" and "Acceptance" tables, and would be astonished to learn that this is actually managed in a thing called "constants". Any time you save now will be lost again onboarding new people (or indeed, reminding yourself in two years' time when you have to fix a bug).
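For illustration only, here is a minimal sketch (SQLite syntax) of what two dedicated tables could look like, plus the "which EULA does this user still need to accept?" query. The table and column names are my assumptions, extrapolated from the EULAs / UserAcceptedEULA suggestion in the question, not the asker's actual schema:

    import sqlite3

    # Illustrative schema only; names extrapolated from the question's suggestion.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE EULA (
        EULAID     INTEGER PRIMARY KEY,
        ClientID   INTEGER NOT NULL,
        Version    INTEGER NOT NULL,
        EULAText   TEXT    NOT NULL,
        CreatedAt  TEXT    NOT NULL,
        UNIQUE (ClientID, Version)
    );
    CREATE TABLE UserAcceptedEULA (
        UserID     INTEGER NOT NULL,
        EULAID     INTEGER NOT NULL REFERENCES EULA(EULAID),
        AcceptedAt TEXT    NOT NULL,
        PRIMARY KEY (UserID, EULAID)
    );
    """)

    # Latest EULA for a client that a given user has not yet accepted (shown at login).
    query = """
    SELECT e.EULAID, e.Version
    FROM EULA e
    WHERE e.ClientID = ?
      AND e.Version = (SELECT MAX(Version) FROM EULA WHERE ClientID = ?)
      AND NOT EXISTS (SELECT 1 FROM UserAcceptedEULA a
                      WHERE a.EULAID = e.EULAID AND a.UserID = ?)
    """
    print(conn.execute(query, (42, 42, 7)).fetchall())  # -> [] until rows are inserted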
I'm working on a flight data analysis project. The flight data is represented in a tabular format: each quarter of a second, we have the status of different parameters, including turboreactor parameters and avionic parameters. I intend to use an expert system to analyse the flight data in order to detect anomalies during the flight. For example, T4 (temperature) shouldn't surpass 750 °C for over 30 seconds. Is the expert system architecture appropriate for such a task?
Every expert system consists of a knowledge base and an inference engine.
If you are going to use the expert system architecture:
You have to make sure that you can gather this knowledge, both factual and heuristic. That knowledge is encoded as rules, mostly consisting of an IF part and a THEN part.
How you apply these rules is defined by the inference engine: the problem-solving model, where the common paradigm is chaining of IF-THEN rules (e.g. forward chaining and backward chaining).
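As a toy illustration of that rule/engine split (not a full inference engine), the T4 requirement from the question could be written as one IF-THEN rule evaluated over a sliding window of the 4 Hz data; the sample readings below are made up:

    # Toy rule-based anomaly check over flight data sampled every 0.25 s (illustrative only).
    # RULES holds the knowledge (IF condition THEN report); the loop below is a very
    # reduced stand-in for the inference engine.
    SAMPLE_RATE_HZ = 4
    WINDOW = 30 * SAMPLE_RATE_HZ  # 30 seconds of samples

    def t4_over_limit_30s(window):
        """IF T4 stays above 750 degC for a whole 30 s window THEN it is an anomaly."""
        return all(t4 > 750.0 for t4 in window)

    RULES = [("T4 above 750 degC for over 30 s", t4_over_limit_30s)]

    def detect_anomalies(t4_samples):
        anomalies = []
        for i in range(len(t4_samples) - WINDOW + 1):
            window = t4_samples[i:i + WINDOW]
            for name, condition in RULES:
                if condition(window):                             # IF part
                    anomalies.append((i / SAMPLE_RATE_HZ, name))  # THEN part
        return anomalies

    # Fake trace: 5 s of normal readings, then 35 s above the limit.
    samples = [700.0] * (5 * SAMPLE_RATE_HZ) + [760.0] * (35 * SAMPLE_RATE_HZ)
    print(detect_anomalies(samples)[:1])  # -> [(5.0, 'T4 above 750 degC for over 30 s')]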
Now, answering your question: to me your example looks like a specification of a discrete cyber-physical system (depending on the other specifications, it could be considered hybrid too). A cyber-physical system can also be considered a state machine: a system that exists in a limited number of conditions, has forbidden states, and progresses from one state to the next according to a fixed set of rules. In addition, if you had possible input and output events in your example, you could design Moore or Mealy machines, Petri nets, or statecharts of your state machine, given the specifications, and then use formal verification techniques to verify it.
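For contrast, here is the same T4 condition viewed as a small state machine with two states, NORMAL and ANOMALY, driven sample by sample. A real model would of course have many more states, inputs and outputs; the states and thresholds below are assumptions:

    # Two-state sketch of the T4 condition (illustrative only).
    def run_t4_state_machine(t4_samples, limit=750.0, max_seconds=30, sample_rate_hz=4):
        state, consecutive_hot, events = "NORMAL", 0, []
        for i, t4 in enumerate(t4_samples):
            consecutive_hot = consecutive_hot + 1 if t4 > limit else 0
            if state == "NORMAL" and consecutive_hot > max_seconds * sample_rate_hz:
                state = "ANOMALY"  # the forbidden state has been reached
                events.append((i / sample_rate_hz, "entered ANOMALY"))
            elif state == "ANOMALY" and consecutive_hot == 0:
                state = "NORMAL"
                events.append((i / sample_rate_hz, "back to NORMAL"))
        return events

    samples = [700.0] * 20 + [760.0] * 140  # same fake 40 s trace as above
    print(run_t4_state_machine(samples))    # -> [(35.0, 'entered ANOMALY')]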
I'm starting to look into doing some machine translation of search queries, and have been trying to think of different ways to rate my translation system between iterations and against other systems. The first thing that comes to mind is getting translations of a set of search terms from a bunch of people on Mechanical Turk and having them judge whether each is valid, or something along those lines, but that would be expensive, and possibly prone to people putting in bad translations.
Now that I'm trying to think of something cheaper or better, I figured I'd ask StackOverflow for ideas, in case there's already some standard available, or someone has tried to find one of these before. Does anyone know, for example, how Google Translate rates various iterations of their system?
There is some information below that might be useful, as it provides a basic explanation of the BLEU scoring technique that developers often use to measure the quality of an MT system.
The first link provides a basic overview of BLEU, and the second points out some problems with BLEU in terms of its limitations.
http://kv-emptypages.blogspot.com/2010/03/need-for-automated-quality-measurement.html
and
http://kv-emptypages.blogspot.com/2010/03/problems-with-bleu-and-new-translation.html
There is also some very specific, pragmatic advice on how to develop a useful test set on the AsiaOnline.Net site, in the November newsletter. I am unable to include that link, as there is a limit of two links per post.
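If you do try BLEU, you don't have to implement it yourself; NLTK ships a sentence-level implementation, sketched below on made-up query translations. Keep in mind that BLEU behaves poorly on very short segments like search queries (one of the limitations the second link discusses), so treat this as a starting point rather than a recommendation:

    # Rough sketch: scoring a machine-translated query against human references
    # with NLTK's sentence-level BLEU. The queries are made up for illustration.
    from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

    references = [
        "cheap flights to paris".split(),
        "low cost flights to paris".split(),
    ]
    hypothesis = "cheap flight to paris".split()

    # Smoothing matters a lot on segments as short as search queries.
    score = sentence_bleu(references, hypothesis,
                          smoothing_function=SmoothingFunction().method1)
    print("BLEU: %.3f" % score)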
I'd suggest refining your question. There are a great many metrics for machine translation, and it depends on what you're trying to do. In your case, I believe the problem is simply stated as: "Given a set of queries in language L1, how can I measure the quality of the translations into L2, in a web search context?"
This is basically cross-language information retrieval.
What's important to realize here is that you don't actually care about providing the user with the translation of the query: you want to get them the results that they would have gotten from a good translation of the query.
To that end, you can simply measure the discrepancy of the results lists between a gold translation and the result of your system. There are many metrics for rank correlation, set overlap, etc., that you can use. The point is that you need not judge each and every translation, but just evaluate whether the automatic translation gives you the same results as a human translation.
As for people proposing bad translations, you can assess whether the putative gold-standard candidates have similar results lists (i.e. given 3 manual translations, do they agree in results? If not, use the 2 that overlap the most). If they do agree, then these translations are effectively synonyms from the IR perspective.
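A minimal sketch of that idea: run both the gold translation and the system translation through your search backend and compare the top-k result lists with a simple overlap measure. The search() function below is a placeholder for whatever your engine exposes:

    # Sketch: judge a query translation by how similar its result list is to the
    # result list of a human ("gold") translation. search() is a placeholder.
    def search(query, k=10):
        """Placeholder: return the top-k document IDs for a query from your engine."""
        raise NotImplementedError

    def overlap_at_k(results_a, results_b, k=10):
        """Jaccard overlap of the two top-k result sets (1.0 = identical sets)."""
        a, b = set(results_a[:k]), set(results_b[:k])
        return len(a & b) / float(len(a | b)) if (a or b) else 1.0

    def score_translation(gold_query, mt_query, k=10):
        return overlap_at_k(search(gold_query, k), search(mt_query, k), k)

    print(overlap_at_k(["d1", "d2", "d3"], ["d2", "d3", "d4"]))  # -> 0.5

    # Rank-aware measures (Kendall's tau, rank-biased overlap, nDCG against the
    # gold list, ...) can be dropped in for overlap_at_k if ordering matters.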
In our MT evaluation we use the hLEPOR score (see the slides for details).
Suppose there are two types of messages, QUOTE and TRADE, each with different fields. For example, a TRADE has only a single price, while a QUOTE has both a bid and an ask price. I want to process messages in time order to do something like the following:
if (QUOTE) {
...
}
if (TRADE) {
...
}
My problem is that the two messages are in different formats, so I can't get them into the same database table. If I can't get them into the same database table, how do I process them sequentially? Any ideas for a suitable design?
The answer depends entirely on what you're doing and on where your app plugs into the data streams.
At one extreme, you might merely be answering customer quotes that you're pulling from an API, and basically implementing a cache. In this case two tables are fine.
At the other extreme, you might be monitoring real-time quotes for a high frequency trading platform, in which case the throughput will probably rule out using a database at all (things built around lisp, such as allegrograph, might be more appropriate), except to periodically collect aggregate statistics.
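Whichever storage you pick, one common way to keep QUOTE and TRADE in separate tables (or feeds) and still process everything sequentially is to sort each stream by timestamp and merge them on the way in. A rough sketch with made-up record shapes; with a database you would get the same effect by ordering each SELECT by timestamp and merging the two cursors:

    # Sketch: process two differently-shaped, time-sorted streams in time order.
    # The record layouts below are made up for illustration.
    from heapq import merge

    quotes = [  # (timestamp, 'QUOTE', bid, ask), already sorted by timestamp
        (1.0, "QUOTE", 99.5, 100.5),
        (3.0, "QUOTE", 99.6, 100.4),
    ]
    trades = [  # (timestamp, 'TRADE', price)
        (2.0, "TRADE", 100.0),
        (4.0, "TRADE", 100.2),
    ]

    for record in merge(quotes, trades, key=lambda r: r[0]):
        if record[1] == "QUOTE":
            ts, _, bid, ask = record
            print("%s: quote bid=%s ask=%s" % (ts, bid, ask))
        elif record[1] == "TRADE":
            ts, _, price = record
            print("%s: trade price=%s" % (ts, price))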
The short answer is, "not really." For stock market and other time-series data, a key-value store like Berkeley DB or Mongo is pretty good. Also, a data format like NetCDF (http://en.wikipedia.org/wiki/NetCDF) will likely serve you better in the long run. It also depends on what kind of access you want and how long a time span you want to store.
You didn't indicate what you were doing with the data, which should inform your choices of storage more than anything. For example, a high-speed trading application will have different storage tradeoffs than a historical batch processing system (where Hadoop + NetCDF would be great). YMMV
Kdb+/q is a very good option for tick data; it is used by major banks. Here is the info about that. You can install a trial version and play with it.
Are there any good technical solutions for extremely long-term archiving of data, for example for 25 to 100 years?
Somehow I just don't have a lot of confidence that a SQL 2000 backup file will be usable in court cases or for historians in 25 to 100 years.
This is a customer requirement, not just speculation.
This is comparable to trying to do something useful with a backup from an ENIAC or reading Atari Writer word-processing files. The hardware doesn't necessarily exist anymore, the storage media is likely corrupt, the professionals who knew how to use the technology probably don't exist anymore, etc.
Actually, printing on acid-free paper is probably a much better solution than any more advanced technological one. It is much more likely that the IT technology of 100+ years from now will be able to high-speed scan and load print than any digital data storage based on 100-year-old media-access hardware, technology and standards, 100-year-old disk/file format standards, and 100-year-old data encoding standards.
Disagree? I've got a whole attic full of vinyl records, CDs, 8-tracks, cassette tapes, and floppy disks (4 different densities!) that argue otherwise. And they are only 20 years old! (OK, the 8-tracks are closer to 30.)
The fact is that there is only one data storage and archiving technology that has ever withstood the test of time over 100 years or more and still been cost-effectively retrievable, and that's writing/printing on physical media.
My advice? Don't trust any archival strategy until it's been tested, and there's only one that has passed the 100-year test so far.
You'll need to convert the data to text, perhaps XML.
Then upload it to the cloud, make archival copies, etc.
I think you need to pick a multi-modal approach.
If you have the budget: http://www.archives.gov/era/papers/thic-04.html
<joke>Print it.</joke>
Script the data into flat files (either one file per table, or summarize multiple tables into a file) and write them to high-end archival CDs. In 100 years they will have to load this data into whatever "database" they have, so some manual conversion will be necessary; a nice schema script dumped into a single file would help the poor guy trying to read these files and make the proper joins.
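A rough sketch of that kind of dump (SQLite is used here only to keep the example self-contained; the database file name is made up, and the metadata queries would need adapting for SQL Server or anything else):

    # Sketch: dump every table to a plain-text CSV file plus one schema file, so a
    # future reader needs nothing more exotic than a text editor.
    import csv
    import sqlite3

    conn = sqlite3.connect("archive_me.db")  # assumed input database

    with open("schema.sql", "w") as schema_file:
        tables = conn.execute(
            "SELECT name, sql FROM sqlite_master WHERE type='table'").fetchall()
        for name, ddl in tables:
            schema_file.write(ddl + ";\n\n")  # keep the CREATE TABLE statements
            with open(name + ".csv", "w", newline="") as out:
                cursor = conn.execute("SELECT * FROM " + name)  # names come from the DB itself
                writer = csv.writer(out)
                writer.writerow([col[0] for col in cursor.description])  # header row
                writer.writerows(cursor)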
EDIT
Offer the client a service contract where you make sure they are up to date with the latest archival technology on a yearly basis. This could be a good thing: $
I suggest you consult a specialist company in this field.
You might also be interested in this article:
Strategies for long-term data retention
It might help to speak to one of those companies/organisations.
I don't know if anyone reads this thread anymore, but there is a really good solution for this.
There is a new company called Millenniata; they have a product called M-Disc. The M-Disc is essentially a DVD made out of rock-like materials that give it an estimated shelf life of 1,000+ years. You have to have a special DVD burner to burn the discs, but it is not that expensive, and any normal DVD reader can read them. I have a professor at BYU who helped form this company; it is some pretty cool technology. Good luck.
Link to M-Disc Website