language testing (pma)3-4(2)
Session 3 Reliability and Validity

When making a test, there are two basic factors to consider: validity and reliability. In this chapter, we will discuss what these are and how they must be considered.

1. Definition

Reliability is concerned with answering the question “How much of an individual’s test performance is due to measurement error, or to factors other than the language ability we want to measure?” and with minimizing the effects of these factors on test scores.

Validity is concerned with the question “How much of an individual’s test performance is due to the language abilities we want to measure?” and with maximizing the effects of these abilities on test scores. (Bachman, 1990: 160-161)

Validity

Validity can be defined as the degree to which a test actually tests what it is intended to test. If the purpose of a test is to test ability to communicate in English, then it is valid if it does actually test ability to communicate. If what it is testing is actually knowledge of grammar, then it is not a valid test for testing ability to communicate.

This definition has two very important aspects. The first is that validity is a matter of degree. Tests are not either valid or not valid. There are degrees of validity, and some tests are more valid than others. A second important aspect of this definition is that tests are only valid or invalid in terms of their intended use. If a test is intended to test reading ability, but it also tests writing, then it may not be valid for testing reading, but it may test reading and writing together.

Relationship between reliability and validity

In order for a test to be valid, it first needs to be reliable. Investigation of reliability and validity can be viewed as complementary aspects of identifying, estimating, and interpreting different sources of variance in test scores.

Factors affecting performance of examinees (Bachman, 1990: 165) (Brown, 1996: 189)

Each of the following factors contributes to the TEST SCORE:
Communicative language ability: meaningful variance (directly related to the purpose of the test)
Test method facets: measurement error
Personal attributes: measurement error
Random factors: measurement error (error variance)

Variance: a measure of variability (SD)

Variance: $s^2 = \frac{\sum (x - \bar{x})^2}{N - 1}$
Standard deviation: $s = \sqrt{\frac{\sum (x - \bar{x})^2}{N - 1}}$
Z-score: $z = \frac{x - \bar{x}}{s}$

Relationship between reliability and validity

Validity and reliability have a complicated relationship. If a test is valid, it must also be reliable. A test that gives different results at different times cannot be valid. However, it is possible for a test to be reliable without being valid. That is, a test can give the same result time after time but not be measuring what it was intended to measure.

Reliability is concerned with determining how much of the variance in test scores is reliable variance, while validity is concerned with determining what abilities contribute to this reliable variance.

Reliability: sources of error in test scores, that is, test scores themselves
Validity: test performance and factors outside the test itself

Campbell (1959: 83)
Reliability: agreement between similar measures of the same trait (e.g. correlation between scores on parallel tests)
Validity: agreement between different measures of the same trait (e.g. correlation between scores on a multiple choice test of grammar and ratings of grammar on an oral interview)

[Figure: Balance between reliability and validity (labels: Validity, Conditions, Reliability), shown for the prescientific period, the psychometric period, and the integrative-sociolinguistic period]

Although it is essential to consider both reliability and validity in the development and use of language tests, the distinction between them may not always be clear.

2. Reliability

Classical true score theory

$x = x_t + x_e$

Observed score ($x$), true score ($x_t$), error score ($x_e$)

$r_{xx'} = .91$: the scores are 91% consistent, or reliable, with 9% measurement error.

Three approaches to estimating reliability within the classical true score model

Test-retest reliability: how consistent test scores are over time
Not too short between the two tests
Not too long between the two tests
Equivalence: the extent to which scores on alternate forms of a test are equivalent

Test-Retest Reliability

Determining test-retest reliability is not a simple matter. There are various ways of trying to measure it, but each of them has potential problems.

Test-retest. One way of measuring reliability is to give the same test twice to the same group of students. However, if a test is given twice, particularly if there is not much time between the two tests, the students might do better the second time due to a practice effect. On the other hand, if there is a longer time between the two tests, the practice effect is not as likely to be important, but it may be that with the passage of time, students' English proficiency has improved.

Parallel groups. Another way to determine reliability is to have two parallel groups take the same test. The problem is determining whether the two groups are truly parallel.

Parallel tests. Reliability can also be measured by giving parallel tests, that is, two similar tests with the same type and number of items, the same instructions, etc. The problem with this approach is determining whether the two tests are actually parallel.

Parallel tests

The true score on one test is equal to the true score on the other.
The error variances for the two tests are equal.

In practice, we virtually never have strictly parallel tests, we treat two tests para
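
The variance, standard deviation, and z-score formulas, and the idea of checking consistency between two administrations of a test, can be illustrated with a short Python sketch. It rests on assumptions not stated in the notes: the reliability estimate is taken to be the Pearson correlation between the two sets of scores, the score lists are invented for illustration, and the function names (`sample_variance`, `standard_deviation`, `z_scores`, `pearson_r`) are hypothetical.

```python
import math


def mean(scores):
    """Arithmetic mean of a list of scores."""
    return sum(scores) / len(scores)


def sample_variance(scores):
    """Sum of squared deviations from the mean, divided by N - 1."""
    m = mean(scores)
    return sum((x - m) ** 2 for x in scores) / (len(scores) - 1)


def standard_deviation(scores):
    """Square root of the sample variance."""
    return math.sqrt(sample_variance(scores))


def z_scores(scores):
    """Express each score as a distance from the mean in SD units."""
    m = mean(scores)
    s = standard_deviation(scores)
    return [(x - m) / s for x in scores]


def pearson_r(first, second):
    """Pearson correlation, computed as the mean product of z-scores
    (using N - 1 to match the variance formula above)."""
    if len(first) != len(second):
        raise ValueError("both administrations need the same examinees")
    products = [a * b for a, b in zip(z_scores(first), z_scores(second))]
    return sum(products) / (len(first) - 1)


# Invented scores for six examinees on two administrations of one test.
first_administration = [52, 61, 70, 74, 83, 90]
second_administration = [55, 60, 72, 71, 85, 93]

r = pearson_r(first_administration, second_administration)
print("Estimated test-retest reliability:", round(r, 2))
```

A coefficient close to 1 would suggest that examinees keep roughly the same relative standing across the two administrations. It does not by itself separate a practice effect or real improvement in proficiency from measurement error.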