Unstructured data in the electronic health record contain essential patient information. Natural language processing (NLP), which teaches a computer to read, makes these data accessible without the time and effort of manual chart abstraction. The first step in any NLP algorithm is preprocessing the text to identify the words that distinguish it while filtering out the noise. Traditional NLP uses a rule-based approach, applying grammatical rules to infer meaning from the text. Newer NLP approaches use machine learning and deep learning, which can infer meaning without being explicitly programmed. NLP use in nephrology research has focused on identifying distinct disease processes, such as chronic kidney disease (CKD), and on extracting patient-oriented outcomes, such as symptoms, with high sensitivity. NLP can also identify, from clinical text, patient features associated with acute kidney injury and progression of CKD. Lastly, including features extracted using NLP improved the performance of risk-prediction models compared with models that use only structured data. Implementation of NLP algorithms has been slow, hindered in part by the lack of external validation of these algorithms. Nevertheless, NLP allows extraction of key patient characteristics from free text, an infrequently used resource in nephrology.
- Machine learning
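
As a rough illustration of the preprocessing step described in the abstract (keeping the words that distinguish a note while filtering out noise), the following minimal Python sketch lowercases a note, splits it into word tokens, and drops common stop words. The stop-word list, the function name `preprocess`, and the example note are all hypothetical and chosen only for illustration; they are not taken from any study or library, and a real pipeline would likely use a curated stop-word list and a clinical tokenizer.

```python
import re
from typing import List

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch).
STOP_WORDS = {"the", "a", "an", "of", "and", "with", "was", "is", "has", "to", "in"}


def preprocess(note: str) -> List[str]:
    """Lowercase the note, split it into alphanumeric tokens, and drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", note.lower())
    return [token for token in tokens if token not in STOP_WORDS]


# Invented example note, not real patient data.
note = "The patient has a history of CKD and was admitted with fatigue."
print(preprocess(note))
# → ['patient', 'history', 'ckd', 'admitted', 'fatigue']
```

The tokens that remain could then feed either a rule-based matcher or a machine learning model, the two families of approaches the abstract contrasts.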