A community of 30,000 US Transcriptionist serving Medical Transcription Industry


Bogus QA Processes - sm


Posted: Feb 26, 2010

A little lesson here in statistics, probability and sampling that might surprise you, so stick with me like a mustard plaster.

First, a couple of definitions:

"Population" = all the things you want to measure, but can't measure directly, usually for practical reasons. The "population" of an MT's work in a month would consist of ALL the reports she transcribed.

"Sample" = some subset of the population that you CAN measure in order to make a statistical inference about the population.

So, for instance, we query a SAMPLE of the entire POPULATION of people who eat out at restaurants to find out if they'd like to have snails on the menu and make an inference about the whole population based on the results from the sample. Let's say that we talk to 100 eaters-outers and 46 of them want snails. It would be nice if we could say that "46% of the population want snails", wouldn't it?

Well, it turns out that we can't. In fact, we know that it is highly UNLIKELY that the results of querying the entire population will be exactly 46%, like our sample. We know that the population results might be 44%, or perhaps 48%, and so we express our sample results with something called a "confidence interval" to express the level of imprecision in our inference. It would look like this: 46% plus or minus 2%, for instance, to indicate that the best we can say from our sample is that we believe that 44 to 48% of the population wants to eat snails.

But wait - we really need to go further and say HOW sure we are that the true population result DOES fall within this specified interval. Are we 90% sure? 95% sure? 99% sure? If I said "46% plus or minus 20%", or 26% to 66% of the population wants snails, I would obviously be much more sure that the population result falls somewhere in that range than if I say "46% plus or minus 1%", or 45% to 47% of the population wants snails.

So if you've stuck with me, you will see that:

1. I can't derive a single number from a sample that I can say with confidence represents the result I would get from measuring the whole population.

2. I have to use a confidence interval (the "plus-minus" thingie) to express the range that I think would include the true population measurement.

3. The wider my confidence interval, the more certain I can be that it would include the true population measurement, and the narrower my confidence interval, the less certain I can be.

So, if you do, say, 20 reports a day five days a week for four weeks, perhaps you do 400 reports in a month. I take, say, 20 of them for a sample and conclude that your "QA score is 97.3%". Dear me, there seems to be a flaw here, doesn't there? Something missing? Something that might be the CONFIDENCE INTERVAL? All I can REALLY say from my sample is that your QA score is 97.3% plus or minus some percentage, right?

There are two ways we can "narrow" the confidence interval:

1. We can take a larger sample. Obviously, the closer we get to sampling the entire population, the smaller the degree of uncertainty we have about whether the range we're expressing includes the true population measurement. Small samples equal more uncertainty; large samples equal less uncertainty.

2. We can accept a lower confidence level. If I only have to be 90% sure that the confidence interval (the range) contains the true population value, I can use a smaller confidence interval than if I must be 99% sure.

So - let's get back to our example. You've transcribed 400 reports in a month (we could use lines also, but this makes the math easier). I want to take a sample of your reports, and I want to be 95% certain that the results from that sample will be within 1% (plus or minus) of the "true" result I'd get from checking ALL your reports. How many of your 400 reports do you think I'd have to take as a sample? 20? 40? 60?

ANSWER: 385.

That's correct, sports fans. If I want to have 95% certainty that your "QA score" truly represents your performance (across all the work you did) within a range of plus-minus 1%, I will have to check nearly ALL of your 400 reports.

So, let's ask the question another way: If I can only sample 40 reports from a population of 400, what will my 95% confidence interval be - in other words, what will be the range around the sample results that I'll have to use in order to be 95% sure that it would include your "true" population result?

ANSWER: 14.7%. Your QA score would be, for instance, 97.3% plus-minus 14.7%.

However, these numbers assume something that probably isn't true, which is that the practical range of QA scores could run from 0 to 100%, and we would certainly hope this isn't so. Let's make a little correction, and say that the typical range of MT scores in our company runs from 95 to 100%. Now let's run our problems again:

SAMPLE SIZE: 329 reports (out of 400, remember) for a confidence interval of plus-minus 1%.

OR

CONFIDENCE INTERVAL: Plus-minus 6.42% for a sample size of 40 reports.

But wait - we're not done yet. ALL of the above assumes that the sample IS representative of the population, and it turns out that it is very difficult to come up with samples that aren't biased in one way or another. Many factors can introduce bias into samples - and especially small samples, which we have already seen have their own problems that are purely mathematical.

Sample bias falls into two main categories:

1. Systematic bias: Our methods of sampling are flawed in such a way that the bias is "built in", which makes it somewhat predictable at least. Systematic bias is recognized when samples are always "off the mark" in one direction or the other (plus or minus) from the true population result.

2. Random bias: Our methods of sampling are flawed in such a way that accidental factors can skew the sample results in a significant way.

There are methods we can use to deal with sample bias, but they are not without cost.

First, we can go ahead and DO some full-population QA checks and compare the "true" results with a sample taken from the same population.

Second, we can do some things that make the sampling process more representative of the population. For instance, we can recognize that certain types of reports are typically easier (e.g. ER notes versus operative reports, or native English dictators versus ESL dictators, etc.) and weight our sample to reflect the relative percentages of each type of report found in the entire population. So - if Suzie did 400 reports of which 70% were ER notes and 30% were op notes, we would not want a sample of 40 reports that contained 12 ER notes and 28 op notes. We would want 28 ER notes and 12 op notes instead (70% and 30%, respectively). This is called "stratified sampling", and obviously it requires attention to the mix of work that people do and attention to see that the sample is representative of that mix.

Another thing we might want to do to protect the integrity of the sample would be to eliminate "outliers" from the sample. "Outliers" are jobs that are "atypical" for reasons beyond the MT's control. For instance...a dictation that had terrible sound quality compared to most. Or...a dictation that represented a worktype that this MT has never done before. (Yes, of course we want to QA these reports, but we do NOT make it part of her monthly QA!)...

Are you still sticking with me? We will next address the completely arbitrary "point system" promoted by our old friends, AAMT, or whatever they're calling themselves these days!
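
The sample sizes and plus-minus figures quoted in the post can be reproduced with a short sketch. It assumes the usual normal-approximation formula for a proportion with a finite-population correction, and it assumes that "scores run from 95 to 100%" corresponds to a response distribution of 5% (that choice is a guess that happens to match the quoted 329 and 6.42%). The function names are made up for illustration.

```python
import math
from statistics import NormalDist


def z_value(confidence):
    """Two-sided critical value of the standard normal, e.g. about 1.96 for 0.95."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)


def sample_size(population, margin_pct, confidence=0.95, response_pct=50.0):
    """Reports to check so the result is within +/- margin_pct (assumed formula)."""
    x = z_value(confidence) ** 2 * response_pct * (100 - response_pct)
    n = population * x / ((population - 1) * margin_pct ** 2 + x)
    return math.ceil(n)


def margin_of_error(population, sample, confidence=0.95, response_pct=50.0):
    """Plus-minus range (in percentage points) for a given sample size."""
    x = z_value(confidence) ** 2 * response_pct * (100 - response_pct)
    return math.sqrt((population - sample) * x / (sample * (population - 1)))


# 400 reports in the month, 95% confidence
print(sample_size(400, 1.0))                                  # 385
print(f"{margin_of_error(400, 40):.1f}")                      # 14.7
print(sample_size(400, 1.0, response_pct=5.0))                # 329
print(f"{margin_of_error(400, 40, response_pct=5.0):.2f}")    # 6.42
```

Changing `population`, `sample` or `response_pct` shows how quickly the plus-minus range widens when only a handful of reports are checked.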

Excellent post! Thank you. - A

[ In Reply To ..]
Thank you for putting this into MT terms. I took a statistics class while in school, and the QA percentage thing has driven me nuts from the beginning...

I had a consistent QA score of 99% or higher, then ONE report brought it all down. It was awful, and I can't remember why because it was so awful. Something about the sound quality was terrible, I believe it was very crackly and had static issues, etc. But anyway, there were blanks! There were flags and things too...and it was part of my QA and because there were so FEW reports actually QA'd...that single report brought my score down to below 85% or some such thing.

And I took a pay cut, they decreased my work flow, and I had completely "run out" of work during my shift (the little they were even giving me, that is) for a month before I moved on and found work elsewhere.

On top of that, how easy do you think it is for a company, say one who either does or does not outsource, to take just some of their reports for QA purposes...and then boast of their terrific scores to clients! Believe you me, there are MANY dictators who don't buy their 98% or higher quality scores. And for good reason!

This is exactly the concept I KNEW was correct, only - Kiki

[ In Reply To ..]
I couldn't place my finger on it exactly or put it into words as fantastically as you just did. THANK YOU for posting that.

I don't mind having a percentage of, say 98%, to aspire to and we SHOULD be aiming for that or even 100%. BUT, if we are going to be penalized in any way (pay cut, firing, or just "performance managed" and made to feel inferior), THEN they need to start making it more fair. I always knew it wasn't, and once when I did fail because of ONE horrible report, I was seething when I got the call telling me I needed to "improve." All I could do was say nothing much and humbly end the call. Now, if that ever happens to me again, I will have some ammo. I actually printed this out, and I suggest every MT do the same. It might not stop a company if they are using this "system" to weed out some MTs, but it might at least help you to get UI if they try to fire you and you live in a state where UI isn't automatic if you are fired, which is most states, I think (except CA, where you get UI fired or not).

THANK YOU AGAIN!!!!!

How do you know this is the actual way every company does QA? - I could not follow it enough to get past the 1st p

[ In Reply To ..]
I think it is made up, just someone on a rant.

I followed it perfectly well. She didn't say it - Kiki

[ In Reply To ..]
was the way EVERY company did QA. If you are going to comment at all you should really read the entire post before you give up and accuse someone of making something up. It MAKES PERFECT sense.

UI in states fired or not - MT

[ In Reply To ..]
The statistics post was excellent, and the MT QA procedures, in my opinion, have always been a bogus deal devised by some people that I wonder if they even know statistics or how to measure things accurately (or somewhat accurately) in the first place.

Something to consider is the UI insurance. IT IS NOT AUTOMATIC in every state. If you get fired you CANNOT collect UI insurance if the state legally determines that it was through your fault. In other words, if you are a terrible employee who doesn't know anything to do the job in the first place and were fired because of your incompetence through your own fault, you probably won't get UI. AND . . . if you don't follow company policy and are a terrible employee in that way, you won't get it. The ONLY WAY that you get UI if you are fired is if you were discharged (fired) for unjust cause, meaning that they fired you for no real apparent reason, i.e. something that they pretty much made up to just get rid of you. BUT . . . be aware that the burden of proof is on you, and so, the ethical behavior of your employer has a lot to do with how you are treated.

Maybe that's how it is in your state, but in CALIFORNIA - Kiki

[ In Reply To ..]
you will most definitely get UI if you get fired. My sister used to work for CA state in the unemployment office. She said that almost anyone who gets fired could get UI. For example, she knew of a guy who made several VERY ugly racist remarks, got fired, and then sat on his ridiculous arse for 2 years collecting UI.

She said there are only a few very rare cases where it would be denied. To me, that guy above's case should have been denied. But if HE got UI, you can bet your bottom anyone at MQ (in California) could get it.

Huh? - :/

[ In Reply To ..]
Does that last paragraph mean you are coming back with more?

I feel faint . . . - my head is spinning . . .

[ In Reply To ..]
Sorry, too much for me to follow.

It does seem a bit rambling, on and on and on . . . . .

Could you condense this all in a sentence or two of what exactly you are saying in a nutshell?

I think overall, you are saying you really dispute how QA is done on your work because there are just too many variables at play which you feel are not being taken into consideration, thus making the final audit not a true representation of the actual quality of your work?

Or am I over-simplifying? Or did I take a left turn somewhere and just don't get whatever it is that you are saying?

(PS: Too much rambling is always a bad thing. We are not sitting in class trying to earn a degree here. I found myself halfway through your post thinking: "Let's see, I need milk, bread, eggs, fruit, salad stuff . . . .")

Okay - - the short version

[ In Reply To ..]
1. You have two choices in doing QA - you can check all of the reports (the "population") that an MT has done during the "QA period", or you can check a sample of reports and draw a statistical inference about the population based on the sample.

2. If the sample QA score is 98%, it's very unlikely that the "true" score for the whole population of reports would be 98%. It will be 98%, plus-minus some percent - for instance, 98% plus-minus 2%, or SOMEWHERE between 96% and 100%. Or it could be 98% plus-minus 3%, or 95-100% (can't be over 100% of course). It depends on our sample size and how "sure" we need to be that this range includes the "true" population score.

We cannot say more than this. We CANNOT take a sample, score it, and assign THAT score to the whole population of an MT's work during the QA period. We must express our uncertainty by using a range. That's why you see political polls saying "48% of voters support Candidate Smith, PLUS-MINUS 4%".

3. This "plus-minus range" (96-100%) is the confidence interval, and ALL you can say about the entire batch of work this MT has done during the QA period is that it falls SOMEWHERE in that range.

4. It takes a very large sample (over 300 reports out of 400, for instance) to get a confidence interval that is this small (plus-minus 2%), or precise, in other words. If I use a small sample (say, 40 reports), I can't be as sure of the "closeness" of the sample to the population, and the range where I am confident the true value falls grows (indicating less certainty). Smaller sample = less precision.

So now, thanks to my small sample, I might have to say the QA score is 96% plus-minus 6% (of course it can't be above 100, though)...or 90-100%.

5. All the above assumes you have a truly representative sample, but there are a lot of potential problems with sampling. These problems are exacerbated with small samples, and, contrary to what many think, a "random" sample isn't the answer (in the case of QA) unless a random sample accidentally happens to be representative of the mix of reports in the whole population. To get a representative sample, you must specifically select a certain mix of reports that mirrors the mix of reports in the population - e.g., a stratified sample - and you must specifically exclude "outliers" (jobs with special atypical problems) that would skew the sample QA score.
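
The 70/30 example from the original post (28 ER notes and 12 op notes in a sample of 40) amounts to proportional allocation. A minimal sketch, assuming reports are already grouped by worktype and that outliers have been removed beforehand; the function names and the worktype labels are hypothetical, and picking reports at random WITHIN each worktype is only one possible way to fill each quota:

```python
import random


def proportional_allocation(stratum_counts, sample_size):
    """Split sample_size across worktypes in proportion to their share of the work."""
    total = sum(stratum_counts.values())
    exact = {k: sample_size * c / total for k, c in stratum_counts.items()}
    alloc = {k: int(v) for k, v in exact.items()}  # round down first
    leftover = sample_size - sum(alloc.values())
    # Hand any remaining slots to the worktypes with the largest fractional parts.
    by_fraction = sorted(exact, key=lambda k: exact[k] - alloc[k], reverse=True)
    for k in by_fraction[:leftover]:
        alloc[k] += 1
    return alloc


def stratified_sample(reports_by_type, sample_size, seed=None):
    """Pick the allocated number of reports from each worktype."""
    rng = random.Random(seed)
    counts = {k: len(v) for k, v in reports_by_type.items()}
    alloc = proportional_allocation(counts, sample_size)
    return {k: rng.sample(reports_by_type[k], alloc[k]) for k in reports_by_type}


# Suzie's month: 280 ER notes and 120 op notes, sample of 40
print(proportional_allocation({"ER": 280, "op": 120}, 40))  # {'ER': 28, 'op': 12}
```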

That's it, up to this point. The AHDI "point scale" is next.

If you'd like to play around with sample sizes, confidence intervals, etc. and try different scenarios with different parameters, there's an online application called the "sample size calculator" (but you can test the effect of changing any parameter).

If you try this, there are two things you should know:

1. There are three typical "confidence levels" in statistics - 90%, 95% and 99%. What these mean is:

"I need to be 90% (or 95% or 99%) confident that the population score does fall within the confidence interval - plus-minus range - that I've specified."

The MORE sure you need to be that the plus-minus range DOES include the true population score, the LARGER the confidence interval (range) must be, or the LARGER the sample size must be.

2. Use a "response distribution" of 10%. The default, 50%, means that the possible QA score COULD be 0 to 100%, and we would hope that isn't true. I think 90-100% is a reasonable range of theoretically-likely QA scores, although I've seen worse! When the range of "real world" scores is smaller, the math involved in sample sizing adjusts the needed sample size downward somewhat because the degree of real-world uncertainty is smaller.

In the top section, you can determine sample size for your given parameters. In the bottom (Alternate Scenarios), you can enter different sample sizes and it will use your parameters in the top section to show you the change in the confidence interval with different sample sizes.

http://www.raosoft.com/samplesize.html
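
For anyone who would rather not use the web page, here is a rough sketch of the first point above: how the plus-minus range grows as the required confidence level goes from 90% to 99%. It assumes the normal-approximation formula with a finite-population correction, 400 reports, a sample of 40 and the 10% response distribution suggested above; `margins_by_level` is a made-up helper name.

```python
import math
from statistics import NormalDist


def margins_by_level(population, sample, response_pct, levels=(0.90, 0.95, 0.99)):
    """Plus-minus range (percentage points) at each confidence level (assumed formula)."""
    spread = response_pct * (100 - response_pct)
    correction = (population - sample) / (sample * (population - 1))
    result = {}
    for level in levels:
        z = NormalDist().inv_cdf(0.5 + level / 2)  # two-sided critical value
        result[level] = z * math.sqrt(spread * correction)
    return result


for level, margin in margins_by_level(400, 40, 10.0).items():
    print(f"{level:.0%} confident: plus-minus {margin:.1f}%")
```

The more sure you need to be, the wider the range gets for the same 40 reports.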

Ho hum, long winded way of justifying your low score. nm - improve your work

[ In Reply To ..]
bb

I would not be so harsh. Just wait until it happens to you. - And it will.

[ In Reply To ..]
Eventually out of all your 100% perfect reports, they will pick the ONE you had the most trouble with and find something totally ridiculous to bring your score way down.

It happened to me. That one report out of all of my near perfect reports gave me a score that won me a verbal warning.

My own supervisor, knowing and seeing my work for years upon years, even admitted it was bogus.

Ho hum, short-winded way of justifying the fact - that you blew off the post

[ In Reply To ..]
ASS-U-ME, much?

Bogus - Reminds me of........

[ In Reply To ..]
When I got tired of transcription and took a job in the QA dept. of a large national HMO. We did statistics on "indicators"/population, based on processes different departments wanted to monitor to see if there was a flaw in the quality of patient care in that particular area. Have I lost you yet? For example we had an indicator for a woman who had an abnormal finding on her mammo. The indicator was...(From the time of abnormal finding), i.e. transcribed report received by the ordering physician/or a phone call to that ordering physician from the Radiologist interpreting the mammo until the time there was a definitive diagnosis was 48 hours. Make sense? 48 hours of not knowing if they had cancer or not was termed by us as "sleepless nights". In that 48 hours did the patient have repeat views taken, an ultrasound, needle biopsy, etc. Every patient that had an abnormal finding had a copy of the finding sent to the QA dept. and we did chart review to see if the 48 hour criteria was met and if not where did the process fail. At the end of the specified length of time we did statistics to see how many cases met or failed the criteria. We never set a criteria at 100% because everyone is human and it is impossible to be 100% perfect at all times. So if we had 100 patients that had abnormal findings and 2 did not meet the 48 hour criteria we had a 98% pass 2% fail rate. Then someone decided we should do statistical process control. They brought in a guy to teach all of us how to do statistical process control, he was a mathematical engineer. Some of the nurses in this class had Master's degrees. When he got done explaining the process in each class he gave us a homework assignment. None of us knew what the hell he was talking about. We got through the homework when someone finally managed by fluke to come up with the answer and we would all copy it. What you read in the Bogus post is statistical process control and would never be used to do QA by a MTSO. 
It is done in technical engineering type statistics and I can just about guarantee you there is no way they are going to have one person doing all the QA statistics for an entire MTSO and they are not going to have that many people that could even understand the process. JMHO

You are absolutely correct. - MTSOs would never use that formula.

[ In Reply To ..]
Not the one I did QA for. And I am sure none of the others do.

gotta disagree with you both - regarding sampling

[ In Reply To ..]
The poster gave a great explanation of how larger numbers are derived from a small sample.

This is exactly the method an MTSO would use to calculate accuracy.

Assessing each and every report just isn't possible.

Fantastic math - Jeannieb

[ In Reply To ..]
My son is a middle school math instructor (They don't call them teachers in our district.). I showed him your post. He says you are spot on with the math and stat ideas. It was an interesting post to say the least.

Obviously you didn't get this presentation any better - than you got the one at the HMO.

[ In Reply To ..]
It doesn't matter whether statistical methods are applied to process control, QA, politics or running a lemonade stand. The principles involved in making a statistical inference about a population from a sample are exactly the same.

Your comment about "one person doing all the statistics for an MTSO" - which has nothing to do with anything I said - tells me that you didn't understand the post.

Way too much rambling for me, but here is a thought.... sm - CJ

[ In Reply To ..]
ONE medical misspelled word should count as 2 points off directly from that report. ONE incorrectly used medical term should be 2 points off. The MISUSED term could kill a patient. Leaving off the word NO, could kill a patient. Misspellings, misuse of a term, wrong drug transcribed versus what is dictated, leaving out the word "no" could kill a patient. You should be striving to make 100% on every report. We are MTs and as such we should be dedicated to accuracy/detail. Big deal if you have 19 out of 20 reports correct. What if that 1 report, you had 3 medical errors and killed a patient. Therefore, you fail.

