Refine
Document Type
Language
- English (2)
Has Fulltext
- yes (2)
Is part of the Bibliography
- no (2)
Keywords
- Corporate Credit Risk (1)
- Kreditrisiko (1)
- Twitter <Softwareplattform> (1)
- Twitter analysis (1)
- Unternehmen (1)
- User Generated Content (1)
- credit risk (1)
- sentiment dictionaries (1)
- text mining (1)
- user generated content (1)
Institute
Since textual user generated content from social media platforms contains valuable information for decision support and especially corporate credit risk analysis, automated approaches for text classification such as the application of sentiment dictionaries and machine learning algorithms have received great attention in recent user generated content based research endeavors. While machine learning algorithms require individual training data sets for varying sources, sentiment dictionaries can be applied to texts immediately, whereby domain specific dictionaries attain better results than domain independent word lists. We evaluate by means of a literature review how sentiment dictionaries can be constructed for specific domains and languages. Then, we construct nine versions of German sentiment dictionaries relying on a process model which we developed based on the literature review. We apply the dictionaries to a manually classified German language data set from Twitter in which hints for financial (in)stability of companies have been proven. Based on their classification accuracy, we rank the dictionaries and verify their ranking by utilizing Mc Nemar’s test for significance. Our results indicate, that the significantly best dictionary is based on the German language dictionary SentiWortschatz and an extension approach by use of the lexical-semantic database GermaNet. It achieves a classification accuracy of 59,19 % in the underlying three-case-scenario, in which the Tweets are labelled as negative, neutral or positive. A random classification would attain an accuracy of 33,3 % in the same scenario and hence, automated coding by use of the sentiment dictionaries can lead to a reduction of manual efforts. Our process model can be adopted by other researchers when constructing sentiment dictionaries for various domains and languages. Furthermore, our established dictionaries can be used by practitioners especially in the domain of corporate credit risk analysis for automated text classification which has been conducted manually to a great extent up to today.
During the Corona-Pandemic, information (e.g. from the analysis of balance sheets and payment behavior) traditionally used for corporate credit risk analysis became less valuable because it represents only past circumstances. Therefore, the use of currently published data from social media platforms, which have shown to contain valuable information regarding the financial stability of companies, should be evaluated. In this data e. g. additional information from disappointed employees or customers can be present. In order to analyze in how far this data can improve the information base for corporate credit risk assessment, Twitter data regarding the ten greatest insolvencies of German companies in 2020 and solvent counterparts is analyzed in this paper. The results from t-tests show, that sentiment before the insolvencies is significantly worse than in the comparison group which is in alignment with previously conducted research endeavors. Furthermore, companies can be classified as prospectively solvent or insolvent with up to 70% accuracy by applying the k-nearest-neighbor algorithm to monthly aggregated sentiment scores. No significant differences in the number of Tweets for both groups can be proven, which is in contrast to findings from studies which were conducted before the Corona-Pandemic. The results can be utilized by practitioners and scientists in order to improve decision support systems in the domain of corporate credit risk analysis. From a scientific point of view, the results show, that the information asymmetry between lenders and borrowers in credit relationships, which are principals and agents according to the principal-agent-theory, can be reduced based on user generated content from social media platforms. In future studies, it should be evaluated in how far the data can be integrated in established processes for credit decision making. Furthermore, additional social media platforms as well as samples of companies should be analyzed. Lastly, the authenticity of user generated contend should be taken into account in order to ensure, that credit decisions rely on truthful information only.