I.J. Mathematical Sciences and Computing, 2020, 6, 15-23
Published Online December 2020 in MECS
DOI: 10.5815/ijmsc.2020.06.03

Action Recognition Based on the Modified Two-stream CNN

Dan Zheng^a, Hang Li^a,*, and Shoulin Yin^a,*
^a Software College, Shenyang Normal University, Shenyang 110034, China
Corresponding authors: lihangsoft@163.com; yslinhit@163.com

Received: 20 October 2020; Accepted: 03 November 2020; Published: 08 December 2020

Abstract: Human action recognition is an important research direction in computer vision. Its main task is to simulate the human brain in analyzing and recognizing human action in video, which usually includes individual actions and interactions between people and the external environment. A space-time dual-channel (two-stream) neural network can represent video features from both the spatial and the temporal perspective, and compared with other neural network models it has clear advantages for human action recognition. In this paper, an action recognition method based on an improved space-time two-channel convolutional neural network is proposed. First, the video is divided into several equal-length, non-overlapping segments, and from each segment a frame image representing the static features of the video and a stacked optical flow image representing the motion features are sampled at random. These two kinds of images are then input into the spatial-domain and temporal-domain convolutional neural networks respectively for feature extraction, and the segment features of each video are fused within each channel to obtain the category prediction features of the spatial domain and the temporal domain. Finally, the video action recognition result is obtained by integrating the prediction features of the two channels. Through experiments, various data augmentation methods and transfer learning schemes are discussed to alleviate the over-fitting caused by insufficient training samples, and the effects of different segment numbers, pre-trained networks, segment feature fusion schemes and dual-channel integration strategies on recognition performance are analyzed. The experimental results show that the proposed model can better learn human action features in complex videos and recognize the actions more accurately.

Index Terms: Action recognition, dual-channel, convolutional neural network.
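As a concrete illustration, the segment-based two-stream pipeline summarized in the abstract can be sketched as follows. The backbone tiny_cnn, the segment count SEGMENTS, the flow-stack length FLOW_STACK and the fusion weights below are illustrative placeholders rather than the configuration used in this paper, which employs pre-trained deep CNNs and compares several segment fusion and channel integration schemes.

```python
# Minimal sketch of the segment-based two-stream pipeline (illustrative, not the authors' code).
import torch
import torch.nn as nn

SEGMENTS = 3          # number of equal-length, non-overlapping video segments
FLOW_STACK = 10       # optical-flow frames stacked per segment (x and y -> 2*FLOW_STACK channels)
NUM_CLASSES = 101     # e.g. an action database with 101 categories

def tiny_cnn(in_channels: int) -> nn.Module:
    """Stand-in backbone; the paper uses pre-trained deep CNNs instead."""
    return nn.Sequential(
        nn.Conv2d(in_channels, 32, 3, stride=2, padding=1), nn.ReLU(),
        nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        nn.Linear(64, NUM_CLASSES),
    )

spatial_net = tiny_cnn(in_channels=3)                # one RGB frame per segment
temporal_net = tiny_cnn(in_channels=2 * FLOW_STACK)  # stacked optical flow per segment

def predict(rgb_frames, flow_stacks, spatial_weight=1.0, temporal_weight=1.5):
    """rgb_frames: (SEGMENTS, 3, H, W); flow_stacks: (SEGMENTS, 2*FLOW_STACK, H, W)."""
    # Per-segment class scores from each channel.
    spatial_scores = spatial_net(rgb_frames)      # (SEGMENTS, NUM_CLASSES)
    temporal_scores = temporal_net(flow_stacks)   # (SEGMENTS, NUM_CLASSES)
    # Fuse segment features within each channel (averaging is one simple choice).
    spatial_video = spatial_scores.mean(dim=0)
    temporal_video = temporal_scores.mean(dim=0)
    # Integrate the two channels by weighted averaging (weights are arbitrary placeholders).
    fused = spatial_weight * spatial_video + temporal_weight * temporal_video
    return fused.softmax(dim=-1)

scores = predict(torch.randn(SEGMENTS, 3, 224, 224),
                 torch.randn(SEGMENTS, 2 * FLOW_STACK, 224, 224))
print(scores.argmax().item())
```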
1. Introduction

When human beings acquire information from the outside world, visual information accounts for about 80% of the total information obtained through the sense organs, and it is of great significance for understanding the nature of things. With the rapid development of the mobile Internet and electronic technology, mobile phones and other video capture devices have become widely available, and Internet short-video applications have sprung up rapidly, greatly reducing the cost of shooting and sharing video. This has led to explosive growth of online video resources. These resources enrich people's lives, but because of their sheer volume and the variety of their content, how to intelligently analyze, understand and recognize these video data has become an urgent challenge [1-5].

Human action recognition is an important research direction in the field of computer vision. Its major objective is to simulate the human brain in analyzing and recognizing human action in videos, which usually includes individual human actions and interactions between humans and the outside environment. Among traditional action recognition methods based on hand-designed features, early features based on human body geometry or motion information are only suitable for recognizing simple body movements in simple scenes, whereas spatio-temporal interest point methods are more effective against relatively complex backgrounds. In these methods, interest points or dense sampling points in space-time are first obtained from the video, and local features are computed over the space-time blocks around these points. The feature vector describing the video action is eventually formed by using classic feature encoding methods such as Bag of Features (BoF), Vector of Locally Aggregated Descriptors (VLAD) or Fisher Vector [6-8].

Currently, among local-feature-based approaches, action recognition methods based on Dense Trajectories (DT) have obtained good results on many public real-scene action databases. They obtain dense trajectories by tracking densely sampled points in each frame of the video, and then compute trajectory features to describe the action in the video. For example, Cai [9] used the multi-view super vector (MVSV) as a global descriptor to encode Dense Trajectory features. Wang [10] encoded improved Dense Trajectory (IDT) features with Fisher Vector (FV) encoding. Peng [11] used Bag of Visual Words (BoVW) to encode space-time interest points or improved dense trajectory features. Based on dense trajectory features, Wang [12] proposed a multi-stage video representation model, MoFAP (Motion Features, Atoms, and Phrases), which represents visual information in a hierarchical manner. Dense trajectories can extract action features with wider coverage and finer granularity, but there is usually a large amount of trajectory redundancy, which limits the recognition effect. Along with the success of deep learning in fields such as speech and image recognition, and especially of the convolutional neural network (CNN), a variety of human action recognition methods
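Both the dense-trajectory methods above and the temporal channel of the two-stream network rely on dense optical flow between consecutive frames. A minimal sketch of building the stacked-flow input for one segment is shown below; OpenCV's Farneback flow and the helper name stacked_flow are assumptions for illustration, as the paper does not prescribe a particular flow algorithm.

```python
# Illustrative sketch: stack dense optical flow for one video segment (assumes a readable video file).
import cv2
import numpy as np

def stacked_flow(video_path: str, start: int, length: int = 10) -> np.ndarray:
    """Return a (2*length, H, W) array of x/y flow fields starting at frame `start`."""
    cap = cv2.VideoCapture(video_path)
    cap.set(cv2.CAP_PROP_POS_FRAMES, start)
    ok, frame = cap.read()
    prev = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    channels = []
    for _ in range(length):
        ok, frame = cap.read()
        if not ok:
            break
        curr = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        # Dense optical flow between consecutive grayscale frames (Farneback parameters are defaults).
        flow = cv2.calcOpticalFlowFarneback(prev, curr, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        channels.extend([flow[..., 0], flow[..., 1]])  # horizontal and vertical components
        prev = curr
    cap.release()
    return np.stack(channels)
```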