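Dialect classification is an analytical method that assigns a speaker to a dialect area on the basis of a speech segment, and it is an important component of forensic technology {1}. When public security organs investigate a case in which the suspect's identity is not yet known, for example telecom fraud or intimidation and harassment cases, inferring the suspect's place of origin from his or her dialect can effectively narrow the scope of the investigation and help establish the suspect's identity.

Compared with traditional manual identification that relies on experience, automatic dialect classification is faster, more accurate and more objective. Research on language identification and dialect classification using models built on low-level acoustic features, such as Mel-frequency cepstral coefficients {2} and perceptual linear prediction coefficients {3}, is by now well developed. Reference {4} performed unsupervised clustering of Mel-scale frequency cepstral coefficients (MFCC) with a self-organizing feature map (SOM) and completed dialect classification with support vector machines (SVM). Reference {5} used a vector space model (VSM) to represent segmented acoustic units as features for model training, and ultimately obtained classification results that surpassed those on the standard dataset. To overcome the low accuracy of dialect identification on short utterances, reference {6} used deep learning to obtain bottleneck features with better representational power.

In recent years classic deep learning models have been widely applied to speech classification tasks. Their main contribution is to provide a high-level representation of the raw features and thereby create a feature space that is easier to separate; in the feature space produced by a deep neural network, a shallow softmax classifier is considered sufficient to achieve good recognition. Exploiting the temporal nature of speech, reference {7} used a recurrent neural network (RNN) and a long short-term memory network (LSTM), respectively, to implement dialect classification based on discriminative words. Reference {8} found experimentally that low-level acoustic features lose part of the feature information, and therefore adopted PTN, a learning method based on phoneme sequences, for language classification, achieving higher accuracy than LSTM. For the dialect classification task, reference {9} built a convolutional neural network (CNN) with the MatConvNet toolbox, trained and validated it on two-dimensional spectrograms of single syllables, and achieved a simple classification of Jiangsu dialects. These results indicate that finding a suitable deep learning framework and building an end-to-end deep acoustic model for dialect classification is feasible and worth studying.

The way acoustic information is represented under different network models is one of the important factors affecting recognition accuracy in dialect classification. This paper implements the commonly used network structures SOM, RNN, LSTM and CNN in Python on the TensorFlow platform, and builds a corpus of dialects spoken within Jiangsu Province for model training, validation and testing. Comparative experiments on training-set optimization, network hyperparameter tuning and classification performance are then carried out across the models to verify the acoustic representational capability of each dialect identification network, providing a research basis for the future development of an automatic dialect determination system.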
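1 Common network structures for acoustic modelling

In acoustic modelling for speech recognition, the traditional GMM-HMM model {10} can obtain long-term feature information through frame splicing, but its splicing capacity is limited and it cannot learn deep nonlinear features. With the rise of deep learning, Hinton first applied the deep neural network model DNN-HMM to speech {11} and obtained a marked improvement in recognition performance. Next, to predict long speech sequences, RNNs with self-loop connections were proposed and successfully applied to acoustic modelling {12}, but because of the long-term dependency problem this model is prone to vanishing and exploding gradients. On this basis, Gonzalez et al. {13} proposed an RNN variant, the LSTM, for dialect identification and improved recognition performance by more than 15%. In speech recognition, LSTM-HMM has replaced DNN-HMM as the most widely used structure. CNNs have also been successfully applied to speech recognition and short-utterance language identification {14-15}. The improved CNN models VGG-Net, GoogleNet and ResNet all offer research directions for CNN-based dialect identification.

1.1 RNN

An RNN is a neural network with "internal recursion", as shown in Fig. 1. When representing dialect information, a DNN has to express the context of speech through frame splicing, usually with a fixed window length, whereas an RNN can feed the previous frame into the network as auxiliary information and learn it together with the current frame.

As speech information flows through the network it passes through loops, which allow the network to store information selectively in its internal state. Because the RNN is updated at every step, it can model long speech segments. Note that increasing the number of nodes improves the expressive power of the model but also brings overfitting and lower computational efficiency, so choosing a suitable number of nodes is one of the key points of acoustic modelling.

1.2 LSTM

The LSTM adds state filtering to address the vanishing and exploding gradients that arise from convergence problems when training an RNN. As shown in Fig. 2, a complete LSTM structure can be understood as a left-to-right flow of information vectors, comprising the input vector X_t at time step t, the previous hidden state vector h_{t-1} at time step t-1, and the encoding vector C_{t-1} of the previous memory cell state at time step t-1. Compared with the traditional RNN, the LSTM decouples the hidden state h_t from the memory cell C_t, which doubles the memory capacity and allows the network, by creating a more complex memory, to learn and process input information over longer time intervals.

1.3 CNN

The structural features of an audio spectrogram represent acoustic scene data well and can serve as the input to a deep learning model. The CNN originated from the multilayer perceptron (MLP). It uses the ideas of convolution and pooling and, through sparse connectivity, weight sharing and selective spatio-temporal subsampling, forms a model architecture in the order of input layer, convolutional layers, pooling layers, fully connected layers and output layer, as shown in Fig. 3.

It is feasible to use the translation invariance of convolution in time and space to overcome the variability of dialect signals. At present, however, in dialect classification research the basic CNN algorithms depend rather heavily on the amount of training data.

2 Experiments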
¹êÅ纥ýºc«Ø¥Î¤_°V½m¾Ç²ßªº¦¿Ä¬¤è¨¥¼Ë¥»®w¡A±µµÛ³]p¹êÅç¤è®×¨Ã«Ø¥ß©Ê¯àµû»ù¼Ðã¡A³Ì«á³q¹L¤£¦P¾Ç²ßºôµ¸¼Ò«¬ªº°Ñ¼Æ³]¸m¨Ã¶}®i©Ê¯àµû»ù¡A«õ±¸¨ã¦³¤è¨¥¯S©ºªº»yµ¼Æ¦r«H¸¹µ²ºc¯SÂI¡A«Ø¥ß°ò¤_²`«×¾Ç²ßªºÁn¾Ç«Ø¼Ò¤èªk¡C 2.1¤è¨¥´ú¸Õ¶° ¤è¨¥®wºc«Øªö¥Î±q¤w¦³¸ê·½¤¤«õ±¸²Å¦X°V½mn¨Dªº»yµ»y®Æ¼Æ¾Ú¶°ªº¤è®×¡Aºc«Ø¦¿Ä¬¬Ù¤º¤è¨¥¼Æ¾Ú¶°¡C¦¿Ä¬¤è¨¥°Ï¤jP¤À¬°¦¿²a©x¸Ü¡B§d»y»P¤¤ì©x¸Ü¤TӤ訥°Ï¡C¨ä¤¤¡A¤¤ì©x¸Üªº¤À¥¬°Ï°ì¤jP¦ì¤_®}¦{¡B±J¾E¥_³¡»P³s¶³´ä¥_³¡¡F§d»y¤À¥¬°Ï°ì¤jP¦ì¤_Ĭ¿ü±`¦a°Ï¡B«n³q«n³¡¡B®õ¦{¹t¦¿¡BÂí¦¿¤¦¶§¡F¦¿²a©x¸Ü«h¤jP¤À¥¬¤_¦¿Ä¬¬Ù¨ä§E¿¤¥«°Ï¡C§d»y¤S¥i²Ó¤À¬°¤Ó´ò¤ù©M«Å¦{¤ù¡A¦b¦¿Ä¬¬ÙÁҰϤº¥Dn¥Ñ¤Ó´ò¤ùªº¬s³®¤p¤ù»PĬº¹Å¤p¤ù²Õ¦¨¡A¦¿²a©x¸Ü¥i²Ó¤À¬°®õ¦p¤ù»P¬x±_¤ù¡C¥Ø«e¤w¦³°w¹ï¥þ°ê¤è¨¥¤j¤À°Ï¦Ó³]pªº¤½¶}¼Æ¾Ú¶°¡A¦ý¹ê½î¹Lµ{¤¤µo²{³¡¤À¯S©w¦a°Ï¤´¹F¤£¨ì¶}®i²`«×¾Ç²ß°V½m©Ò»Ý¼Æ¾Ú¶q¡C¦P®É¡A¼Æ¾Ú¶°¦s¦b¼ÐñªþµÛ¤£§¹¾ã¡A²Ó¤À«×¤£¨¬µ¥°ÝÃD¡C°ò¤_¦¹¡A¹êÅ礤º¥ý³]p¥Ñ¦U¦a°Ï¥Nªí©Ê«U»y²Õ¦¨ªº¤å¥»¡A¨Ã¹ï¤Wz¤è¨¥°Ï¤º105Ó»¡¸Ü¤H«ö·Ó¥»¦a°Ï«U»y¤å¥»ªº¤fz»y®Æ¶i¦æ¿ýµªö¶°¡C¦P®É±N¤¤¤å¤è¨¥»yµÃѧO¼Æ¾Ú®w¡]King-ASR-M-004¡^¡B¡m¤è¨¥¦¿Ä¬¡P¶mµ±y´¡nµøÀW¸ê®Æ§@¬°¸É¥R»y®Æ¡C©Ò¦³»y®Æ¼Ë¥»§¡¸g¹L°¾¸¡B°Ñ¼Æ³W¾ã¡BÀRÀq¬q¤Á°£ªº¹w³B²z¡A®æ¦¡¬°wav®æ¦¡¡Aªö¼Ë²v11025 Hz¡Aªö¼Ë¦ì¼Æ16 bit¡A®Éªø10 s¡A¦@p±o¨ì¦¿Ä¬¤è¨¥°ò¦»y®Æ¼Ë¥»6036¨Ò¡C¦P®É¡A®Ú¾Ú»¡¸Ü¤H©ÒÄݦa°Ï¹ï¨C¨Ò»y®Æ¼Ë¥»²K¥[ÂkÄݦa¡]¦@5Ãþ¡^¼Ðñ¡C ¡]¹Ï²¤¡^ ¹Ï1¡@RNNºôµ¸¼Ò«¬µ²ºc Fig.1 RNN network model ¡]¹Ï²¤¡^ ¹Ï2¡@LSTMºôµ¸¼Ò«¬µ²ºc Fig.2 LSTM network model ¡]¹Ï²¤¡^ ¹Ï3¡@CNNºôµ¸¼Ò«¬¬[ºc Fig.3 CNN network model ªí1¬°²Õ¦¨ªö¶°¤å¥»ªº¦U¦a°Ï³¡¤À¨å«¬«U»y¡A¨ä¤¤¦U¦a«U»yªö¥Îº~»yª`µ¤èªk¡]¾î¦¡¡^¡C ¹êÅ礤«ö·Ó6?2?2ªº¤ñ¨Ò¤Á¤À°V½m¡B¶}µo»P´ú¸Õ¶°¡C¹ï¤è¨¥´ú¸Õ¶°¶i¦æ¶i¤@¨B½s¿è¤ÀÃþ¡A³q¹L°Å¿è§Î¦¨3 s»P10 sªº´ú¸Õ¶°¡A¦A¹ï´ú¸Õ¼Æ¾Ú¶°¤¤¤À§O¥[¤J¥Õ¾¸Án¡A§Î¦¨«H¾¸¤ñ¤À§O¬°3¡B10 dBªº´ú¸Õ¶°¡A±q¦Ó³Ì²×§Î¦¨¦@4¥÷´ú¸Õ¼Æ¾Ú¶°¡A¨C¥÷´ú¸Õ¶°¤¤§¡§t1207¥÷´ú¸Õ¼Ë¥»¡C ªí1¡@¦¿Ä¬¬Ù¦U¦a°Ï³¡¤À¨å«¬«U»y Table 1 Typical dialects spoken in Jiangsu province ¤è¨¥°Ï | º~»yª`µ | ´¶³q¸Ü | ¬s³®¤p¤ù
| ¨Õ¨Ú¨Å¨â | ¼F®` | ¨Ñ¨ç¨Ý¨Î¨á | ®É»ì¸Ü | ¨ÐبݨÑبä | ³»¼L | ¨Ï¨â¨ç¨É¨è¨ã | ¦Y¶º | ¨Î¨è¨ã¨Í¨à | ºÎı | ¨ä¨Ù¨â¨Ï¨è¨Û | ¶Ì³J | ¬x±_¤ù
| ¨Ñ¨é¨Ò¨ç¨â | ³L°C | ¨æ¨ç¨×¨ç | ÅܺA | ¨Ò¨ç¨Ú¨Î¨Ü¨×¨ç | ¤U¤Ú | ¨É¨ç¨å¨Å¨â¨æ | §¾ªÑ | ¨Î¨â¨Î¨è¨å | »¡ÁÀ | ¨È¨â¨Ò¨ç¨â | °Q¹½ | ®}²a¤ù
| ¨Í¨ß¨Ò¨ç¨Ý | ªþªñ | ¨Ø¨Þ¨Ï¨è¨Þ | ¤ò¯f | ¨Ì¨ç¨à¨Ø¨Þ | §ä¨Æ | ¨É¨ä¨ç¨ä | ¤¤¶¡ | ¨Ì¨Ú¨Í¨è¨Ú | ²á¤Ñ | ¨Ì¨Þ¨É¨Þ¨Ï¨è¨å | Ã廽 | Ĭº¹Å¤p¤ù
| ¨Ò¨ç¨ä¨Ç¨Ú | §n¬[ | ¨Ò¨ç¨à¨Ë¨ç¨ä¨é | ¤p¤k«Ä | ¨Ö¨è¨â¨Ñ¨ç | ¤£²n | ¨ç¨Ü¨Î¨è¨Þ¨Ê¨á | ³Ä±ß | ¨Ì¨Þ¨Ç¨à | ÀYµo | ¨Ù¨ä¨Í¨Ú¨Ì¨ä | ³}µó | ®õ¦p¤ù
| ¨ç¨Ú¨Ì¨ç | ±ß¤W | ¨è¨Ò¨é | ¤p«K | ¨Í¨Ú¨Í¨Ú | ¦^®a | ¨Õ¨è¨ß¨Ð¨ç | «Cµì | ¨Ö¨á¨Å¨Þ | §j¤û | ¨Ò¨ç¨â¨É¨á¨×¨Þ | è¤~ |
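The noisy test sets of Section 2.1 are built by adding white noise at signal-to-noise ratios of 3 dB and 10 dB. The source does not state how the mixing was done, so the following is only a minimal sketch under stated assumptions: the noise is Gaussian, the SNR is defined from average signal and noise power, and the function names and the sine-wave stand-in for a speech clip are hypothetical.

```python
import math
import random


def mean_power(samples):
    """Average power (mean of squared amplitudes) of a sequence of samples."""
    return sum(x * x for x in samples) / len(samples)


def add_white_noise(signal, snr_db, seed=0):
    """Return the signal plus Gaussian white noise scaled to the requested SNR (dB)."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in signal]
    # The target noise power follows from SNR_dB = 10 * log10(P_signal / P_noise).
    target_noise_power = mean_power(signal) / (10.0 ** (snr_db / 10.0))
    # Rescale the drawn noise so that its measured power equals the target.
    scale = math.sqrt(target_noise_power / mean_power(noise))
    return [s + scale * n for s, n in zip(signal, noise)]


def measured_snr_db(clean, noisy):
    """SNR in dB of a noisy signal relative to its clean reference."""
    noise = [y - x for x, y in zip(clean, noisy)]
    return 10.0 * math.log10(mean_power(clean) / mean_power(noise))


# Hypothetical stand-in for a speech clip: about 0.1 s of a 440 Hz tone
# sampled at 11025 Hz, the sampling rate reported for the corpus.
SAMPLE_RATE = 11025
clean = [math.sin(2.0 * math.pi * 440.0 * t / SAMPLE_RATE) for t in range(1102)]
noisy_3db = add_white_noise(clean, snr_db=3.0)
noisy_10db = add_white_noise(clean, snr_db=10.0)
```

Because the noise is rescaled by its measured power rather than its nominal variance, the achieved SNR matches the target for every clip, which keeps the 3 dB and 10 dB conditions comparable across samples.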
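2.2 Evaluation criteria

The networks are trained, validated and tested on the Jiangsu dialect corpus built above. Because the models solve a closed-set multi-class problem, accuracy is no longer the only evaluation metric. Since the number of samples per class is relatively balanced, the experiment also uses the unweighted macro-average Macro-F1 and the micro-average Micro-F1 of the multi-class precision-recall-F (PRF) metrics to evaluate each model {16}.

These metrics decompose the multi-class problem into N binary problems, producing one confusion matrix per class. In each confusion matrix, a sample of the dialect of region M that is correctly classified as M is a true positive (TP); a sample that does not belong to region M but is wrongly classified as M is a false positive (FP); a sample that belongs to region M but is wrongly classified as another region is a false negative (FN).

1) Micro-F1

Micro-F1 is defined as the harmonic mean of the micro-precision and the micro-recall. The formulas are:

Micro-precision: (formula omitted) (1)
Micro-recall: (formula omitted) (2)
micro-F1 = 2 × microP × microR / (microP + microR) (3)

where (formula omitted), (formula omitted) and (formula omitted) denote the arithmetic means of the corresponding values over the confusion matrices.

2) Macro-F1

The precision P and recall R are first computed from the confusion matrix of each class, then averaged without sample weighting to obtain macroP and macroR, from which Macro-F1 is obtained. The formulas are:

Macro-precision: (formula omitted) (4)
Macro-recall: (formula omitted) (5)
macro-F1 = 2 × macroP × macroR / (macroP + macroR) (6)

where n is the number of classes; in this experiment n = 5, the number of dialect areas. Macro-F1 and Micro-F1 reflect the stability of the model and take values in [0, 1]. Provided that generalization ability is fully taken into account, a larger value indicates more stable performance.

2.3 Network implementation

Models based on the four architectures SOM, RNN, LSTM and CNN were trained and validated while searching for the best parameter settings.

2.3.1 Data normalization

For the SOM, RNN and LSTM models, MFCC features were extracted with the kaldi toolkit. The MFCCs together with their first-order and second-order difference coefficients, 36 dimensions in total, were used as the network input. The Fourier transform used a frame length of 25 ms, a frame shift of 10 ms and a Hamming window.

To give the input features the two-dimensional format required by the CNN, the experiment used functions from the PIL image-processing library in Python to normalize the images. The specgram function first converts the one-dimensional audio data into a two-dimensional spectrogram, as in Fig. 4. The crop function then removes blank regions along the time axis. Finally, the frequency and time axes of the spectrogram are made equal in length, and the resize function rescales the image to 224×224.

(figure omitted)
Fig. 4 Feature spectrogram

2.3.2 Network settings

1) SOM model

The experiment clusters the 36-dimensional MFCC features with the SOM and uses a support vector machine (SVM), which follows the principle of structural risk minimization, as the classifier. Because the SOM-SVM model is not affected by the dimensionality of the sample features, it can achieve good classification results with small samples.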
After experimental analysis, the Gaussian radial basis function was chosen as the SVM kernel, with penalty coefficient C = 150 and training error 0.002.

2) RNN/LSTM models

Training showed that hyperparameters such as the learning rate and the number of hidden layers strongly affect classification performance. After repeated trials the number of hidden layers was set to 3, at which accuracy was best. The experiment also tried increasing the number of hidden nodes and found that training time rose markedly once the number of neurons reached 128; the number of neurons was finally set to 256. The sigmoid activation function and the cross-entropy loss were used. To balance convergence speed against the number of iterations, batch_size was set to 64, so that one epoch takes about 60 iterations. The learning rate follows a fixed-step decay strategy: after tuning, the initial learning rate was set to 0.05 and reduced by 0.01 every 20 epochs. The optimizer is AdamOptimizer.

With these settings the loss levels off after about 15 epochs of training. On top of the RNN/LSTM structure, a competitive output mode is used to classify the dialect area to which each node belongs.

3) CNN model

The experimental model follows the convolutional structure of reference {17}. It consists of 6 convolutional layers and 3 fully connected layers, with a single pooling layer added only after the first convolutional layer, and uses maxout as the activation function. The CNN input size is 224×224×3, the convolution kernel size is 3×3 with stride 1, and the pooling size is 3×1, that is, pooling is applied only along the frequency axis.

The padding parameter is set to same, that is, zero-padding is applied with each convolution to control the size of the feature maps. The model has about 4.1M parameters. After repeated experiments the hyperparameters were fixed at mini-batch = 32 and 150 epochs, with a learning rate of 0.05 for the first 100 epochs and 0.005 for the last 50. The output of the final fully connected layer passes through softmax to compute the posterior probabilities and obtain the classification result.

3 Results and discussion

To examine how the number of samples, the signal-to-noise ratio and the sample duration affect dialect classification, the four network models were tested on the four test sets of Jiangsu dialects. The f1_score function of the sklearn library was used to obtain the confusion matrices and the Micro-F1 and Macro-F1 values on each dataset. The confusion matrices of the LSTM and CNN models on the 10 s/10 dB dataset are shown in Fig. 5. Following equations (1) to (6), the Micro-F1 and Macro-F1 values of each model on each test set can be computed from the confusion matrices. Macro-F1 is easily affected by imbalance in the number of samples per dialect in the test set. The results are given in Table 2.

(figure omitted)
Fig. 5 Confusion matrices with the 10 s/10 dB dataset for dialect identification (a. LSTM; b. CNN)

Table 2 Performance evaluation of the network models on the different test sets

Model | Micro-F1 3 dB/3 s | Micro-F1 3 dB/10 s | Micro-F1 10 dB/3 s | Micro-F1 10 dB/10 s | Macro-F1 3 dB/3 s | Macro-F1 3 dB/10 s | Macro-F1 10 dB/3 s | Macro-F1 10 dB/10 s
SOM | 0.522 | 0.565 | 0.701 | 0.706 | 0.513 | 0.568 | 0.695 | 0.701
RNN | 0.659 | 0.672 | 0.717 | 0.723 | 0.647 | 0.667 | 0.718 | 0.720
LSTM | 0.747 | 0.780 | 0.885 | 0.931 | 0.711 | 0.769 | 0.876 | 0.927
CNN | 0.824 | 0.831 | 0.849 | 0.876 | 0.819 | 0.821 | 0.842 | 0.860
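The Micro-F1 and Macro-F1 values in Table 2 follow the definitions of Section 2.2. As a minimal sketch of that computation, assuming the standard definitions (micro-averaging pools TP, FP and FN over all classes, macro-averaging takes the unweighted mean of per-class precision and recall before forming the harmonic mean), the functions below derive both scores from one multi-class confusion matrix. The function names and the 3-class matrix are hypothetical and are not the paper's data.

```python
def per_class_counts(confusion):
    """Return (TP, FP, FN) for each class.

    confusion[i][j] is the number of samples of true class i predicted as class j.
    """
    n = len(confusion)
    counts = []
    for k in range(n):
        tp = confusion[k][k]
        fp = sum(confusion[i][k] for i in range(n)) - tp  # predicted k, truly another class
        fn = sum(confusion[k]) - tp                       # truly k, predicted another class
        counts.append((tp, fp, fn))
    return counts


def harmonic_f1(precision, recall):
    """Harmonic mean of precision and recall, 0.0 if both are zero."""
    total = precision + recall
    return 2.0 * precision * recall / total if total > 0 else 0.0


def micro_f1(confusion):
    """Pool TP, FP and FN over all classes, then compute precision, recall and F1."""
    counts = per_class_counts(confusion)
    tp = sum(c[0] for c in counts)
    fp = sum(c[1] for c in counts)
    fn = sum(c[2] for c in counts)
    micro_p = tp / (tp + fp) if tp + fp > 0 else 0.0
    micro_r = tp / (tp + fn) if tp + fn > 0 else 0.0
    return harmonic_f1(micro_p, micro_r)


def macro_f1(confusion):
    """Average per-class precision and recall without weights, then take F1."""
    counts = per_class_counts(confusion)
    n = len(counts)
    macro_p = sum(tp / (tp + fp) if tp + fp > 0 else 0.0 for tp, fp, _ in counts) / n
    macro_r = sum(tp / (tp + fn) if tp + fn > 0 else 0.0 for tp, _, fn in counts) / n
    return harmonic_f1(macro_p, macro_r)


# Hypothetical 3-class confusion matrix (rows: true class, columns: predicted class).
confusion = [
    [9, 1, 0],
    [3, 6, 1],
    [0, 0, 10],
]
micro = micro_f1(confusion)
macro = macro_f1(confusion)
```

For single-label multi-class data every false positive of one class is a false negative of another, so Micro-F1 equals overall accuracy. Note also that scikit-learn's `f1_score(..., average="macro")` averages the per-class F1 scores, which can differ slightly from the harmonic mean of macroP and macroR used here.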
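The results in Table 2 show that the LSTM model obtained the best scores overall, reaching a Micro-F1 of 0.931 and a Macro-F1 of 0.927 on the 10 s/10 dB set, while the CNN scored highest on both 3 dB sets; the LSTM and CNN models clearly outperform the SOM and RNN models. The SOM network adapts reasonably well to classification on this dataset, but it is clearly more sensitive to the signal-to-noise ratio, and its accuracy on the 3 dB test sets is poor. The RNN improves dialect identification accuracy to some extent, but it is weak at representing important information separated by long intervals in long utterances. The LSTM overcomes this shortcoming through the structure of its memory cells, and therefore obtains the best scores on the 10 s test set at 10 dB.

The accuracy of the CNN model is relatively stable: it is little affected by the length of the speech segment and is fairly robust to noise. This is because the convolution, pooling and nonlinear transformation operations in the CNN provide better nonlinear expressiveness and fit the information representation needed for dialect classification well. Note, however, that the recognition speed of the CNN does not meet real-time requirements.

In terms of noise robustness, the Micro-F1 and Macro-F1 values of every model are positively correlated with the signal-to-noise ratio of the dataset. Sample duration is also positively correlated with the scores. This indicates that the network models have made some progress in extracting information from short utterances and in robustness to noise, but that the duration and signal-to-noise ratio of the speech samples remain important factors affecting classification accuracy.

4 Conclusions and outlook

Using deep learning frameworks to solve automatic dialect identification is an important direction for future research. This paper studied the implementation of acoustic modelling and the analysis of information representation under the SOM, RNN, LSTM and CNN network models. Comparative analysis of the experimental results shows that both the CNN and the LSTM represent dialect acoustic information well and that the CNN is more robust to noise, which suggests that combining CNN and LSTM is feasible for the future development of automatic dialect classification. The results also show that building a standard voiceprint database requires standardized criteria for collecting voiceprint samples, which would provide an important guarantee for the effective use of voiceprint recognition technology in practical public security work.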