Abstract: Sign language is the primary means of communication for deaf people, but most hearing people do not understand it. Researchers therefore hope to use artificial intelligence to help deaf people integrate into society by enabling hearing people to understand the meaning of sign language. Sign language recognition and translation (SLRT) urgently needs large-scale sign language video data to be practical, and improvements in data quality also have a positive impact on recognition and translation performance. To avoid the high cost of manually screening massive amounts of data, this paper proposes a Two Information Streams Transformer (TIST) model that judges whether the quality of a sign language video is acceptable. Even when signing styles are inconsistent, TIST can use two cross transformers to identify erroneous videos in which the signer misuses gestures, omits gestures, signs gestures in the wrong order, and so on. TIST also uses a temporal transformer to focus on the important frames in a sign language temporal sequence. In addition, this paper proposes a self-adaptive GCN to enhance the extraction of sign gestures from skeletal nodes. Experimentally, TIST achieves state-of-the-art sign language screening accuracy. To verify that data-quality screening is effective in improving sign language recognition, this paper uses VAC as the baseline model; the experimental results show that the screened dataset achieves a lower word error rate (WER) than the unscreened dataset.