Welcome! The following statistics provide some visual insights into the NAPKON Public Data Set. The Public Data Set comprises NAPKON patient data after a data cleaning process and includes data from patients documented until May 16, 2023. The NAPKON Public Data Set originates from the NAPKON SecuTrial. The data anonymization pipeline is described by Jakob et al. in "Design and evaluation of a data anonymization pipeline to promote Open Science on COVID-19". The public data is anonymized following our "data protection concept". The anonymization process was carried out with the "ARX software".
Copyright: This work is licensed under the Creative Commons Attribution Non-Commercial 4.0 License. With the use of this data you agree to include a proper acknowledgement of the NAPKON study group in any work based on the data set. By working with this notebook you agree to maintain the confidentiality of the data set at all times and to not attempt to compromise or otherwise violate the privacy of the patients described. To view a copy of the license, visit https://creativecommons.org/licenses/by-nc/4.0/.
If you have any comments on the notebook, please drop us a message at support@napkon.de.
Here we provide information on the basic structure of the NAPKON Public Data Set.
The data set consists of 5702 patients and 15 variables. A row represents anonymized data of a single patient.
The columns are described by the variables:
For further information regarding the variables please refer to: https://cloud.idcohorts.net/s/ELRzpzgBkKb5ejY and https://cloud.napkon.de/s/W2PQpmqzoRkjABL
*The Clinical Phases are defined according to the WHO clinical progression scale:
To get to know the Public Data Set better, the values of each variable are shown below as they occur in this data set. Please be aware that the Public Data Set is only a part of the complete NAPKON data set. Anonymization may lead to variables having fewer values than in the complete NAPKON data set. For example, the variable 'gender' can also have the value 'diverse', but there is no patient with this gender in the Public Data Set.
n/a: In cases where the patient was not in the phase a variable refers to, the variable has been given the value 'Not applicable (N/a)'. If, for example, a patient never completed the 3-month follow-up, the variable 'any symptoms at the 3-month follow-up' is not applicable to this patient.
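When tabulating a variable, it can help to keep 'Not applicable (N/a)' entries apart from the answers that actually apply. The following is a minimal sketch in plain Python; the list `values` and its 'yes'/'no' entries are made-up illustrations, not values taken from the Public Data Set.

```python
from collections import Counter

# Label used in the data set for variables that do not apply to a patient.
NOT_APPLICABLE = "Not applicable (N/a)"

# Hypothetical values of one variable for five patients (illustration only).
values = ["yes", "no", NOT_APPLICABLE, "yes", NOT_APPLICABLE]

# Count every value, including the N/a label.
counts = Counter(values)

# Keep only the values of patients to whom the variable applies.
applicable = {value: n for value, n in counts.items() if value != NOT_APPLICABLE}
n_applicable = sum(applicable.values())

# Percentage of 'yes' among applicable patients only.
share_yes = 100 * applicable["yes"] / n_applicable

print(f"N/a entries: {counts[NOT_APPLICABLE]}")
print(f"Applicable entries: {n_applicable}")
print(f"Share of 'yes' among applicable entries: {share_yes:.1f} %")
```

Computing percentages over applicable patients only avoids diluting them with patients who never reached the respective phase.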
The following descriptive statistics are computed in this section:
The number of patients from HAP is lower than from SUEP and POP. Please therefore pay attention to the differences in scaling, especially for HAP.
The total number of patients is 5702.
Only cases with complete documentation and Review A status are considered.
The number of patients for SUEP is 2138.
The number of patients for POP is 3252.
The number of patients for HAP is 312.
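As a quick consistency check, the three cohort sizes reported above can be summed and compared with the total number of patients. The sketch below uses only the numbers printed in this notebook; the variable names are chosen for illustration.

```python
# Cohort sizes as reported in this notebook.
cohort_sizes = {"SUEP": 2138, "POP": 3252, "HAP": 312}

# The cohorts together should account for all patients in the data set.
total = sum(cohort_sizes.values())

# Share of each cohort in percent, which explains the different plot scaling.
cohort_share = {name: 100 * n / total for name, n in cohort_sizes.items()}

print(f"Total number of patients: {total}")
for name, share in cohort_share.items():
    print(f"{name}: {share:.1f} % of all patients")
```

HAP contributes by far the smallest share, which is why plots for HAP use a different scale than those for SUEP and POP.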
The following descriptive statistics on the health status at the end of medical consultation are computed in this section:
Note that we will use a filtered data set for computing the rates, which we describe below.
For the COVID-19 mortality and recovery rate computations, we exclude patients with a documented health status at the end of medical consultation of 'unknown/missing'. Please note that this influences the following computations and plots.
The number of patients in the filtered data set for SUEP is 2047.
The number of patients in the filtered data set for POP is 3111.
The number of patients in the filtered data set for HAP is 300.
SUEP
COVID-19 Overall Mortality Rate: 5.96 %
COVID-19 Overall Recovery Rate: 94.04 %
POP
All POP patients survived at least 6 to 12 months after first diagnosis.
HAP
COVID-19 Overall Mortality Rate: 16.0 %
COVID-19 Overall Recovery Rate: 84.0 %
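The filtering and rate computation described above can be sketched as follows. This is not the notebook's own code: the function name `overall_rates` and the status labels 'recovered' and 'deceased' are assumptions made for illustration, and the example data are made up. The sketch also assumes that every known status other than 'deceased' counts as recovered, which is consistent with the two reported rates summing to 100 %.

```python
def overall_rates(statuses):
    """Return (mortality %, recovery %) after dropping 'unknown/missing'."""
    # Exclude patients whose health status is not documented.
    known = [s for s in statuses if s != "unknown/missing"]
    if not known:
        return None  # no rate can be computed without known outcomes

    # Hypothetical label for patients who died (assumption).
    deceased = sum(1 for s in known if s == "deceased")
    recovered = len(known) - deceased

    mortality = round(100 * deceased / len(known), 2)
    recovery = round(100 * recovered / len(known), 2)
    return mortality, recovery


# Made-up example: 30 patients, 5 of them without a known health status.
example = ["recovered"] * 21 + ["deceased"] * 4 + ["unknown/missing"] * 5
rates = overall_rates(example)
print(f"COVID-19 Overall Mortality Rate: {rates[0]} %")
print(f"COVID-19 Overall Recovery Rate: {rates[1]} %")
```

Because the denominator is the filtered count rather than the full cohort size, excluding 'unknown/missing' patients raises both rates slightly compared with dividing by all patients.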
All POP patients survived at least 6 to 12 months after first diagnosis. Thus POP is not shown in the following graphic.
SUEP
HAP
From here on we will indicate the three clinical phases as
In the following we will plot the:
Since there might be patients who have no phase documented at all, we proceed with a filtered data set in which those patients are dropped.
The number of patients with at least one documented mild, moderate, or severe phase in this filtered data set is 5702.
Maximum phase reached by the patients.
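One way to derive the maximum phase per patient is to rank the phases and take the highest documented one, dropping patients without any documented phase. The sketch below is an illustration under assumptions: the phase labels 'mild', 'moderate', and 'severe', the dictionary `patients`, and the helper `max_phase` are hypothetical and not taken from the notebook's code or data.

```python
# Assumed ordering of the three clinical phases, from least to most severe.
PHASE_ORDER = {"mild": 0, "moderate": 1, "severe": 2}


def max_phase(phases):
    """Return the most severe documented phase, or None if there is none."""
    documented = [p for p in phases if p in PHASE_ORDER]
    if not documented:
        return None
    return max(documented, key=PHASE_ORDER.__getitem__)


# Made-up patients with the phases documented for each (illustration only).
patients = {
    "p1": ["mild", "moderate"],
    "p2": ["severe", "mild"],
    "p3": [],  # no phase documented, dropped by the filter below
}

# Filtered data set: only patients with at least one documented phase.
max_phase_by_patient = {
    pid: max_phase(phases)
    for pid, phases in patients.items()
    if max_phase(phases) is not None
}

print(max_phase_by_patient)
```

In this notebook the filtered count equals the total of 5702, so no patient is actually dropped by this step.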