Welcome! The following statistics provide some visual insights into the NAPKON Public Data Set. The Public Data Set comprises NAPKON patient data after a data cleaning process and includes data from patients documented until April 02, 2024. The NAPKON Public Data Set originates from the NAPKON SecuTrial. The data anonymization pipeline is described by Jakob et al. in "Design and evaluation of a data anonymization pipeline to promote Open Science on COVID-19". The public data is anonymized according to our "data protection concept". The anonymization process was carried out with the "ARX software".
Copyright: This work is licensed under the Creative Commons Attribution Non-Commercial 4.0 License. With the use of this data you agree to include a proper acknowledgement of the NAPKON study group in any work based on the data set. By working with this notebook you agree to maintain the confidentiality of the data set at all times and to not attempt to compromise or otherwise violate the privacy of the patients described. To view a copy of the license, visit https://creativecommons.org/licenses/by-nc/4.0/.
If you have any comments on the notebook, please drop us a message at support@napkon.de.
Here we provide information on the basic structure of the NAPKON Public Data Set.
The data set consists of 6012 patients and 16 variables. A row represents anonymized data of a single patient.
The columns are described by the variables:
For further information regarding the variables please refer to: https://cloud.idcohorts.net/s/ELRzpzgBkKb5ejY and https://cloud.napkon.de/s/W2PQpmqzoRkjABL
*The Clinical Phases are defined according to the WHO clinical progression scale:
To get to know the Public Data Set better, the values that each variable takes in this data set are shown below. Please be aware that the Public Data Set is only a part of the complete NAPKON data set. Anonymization may lead to variables having fewer values than in the complete NAPKON data set. For example, the variable 'gender' can also have the value 'diverse', but there is no patient with this gender in the Public Data Set.
age: 18 - 39 years, 40 - 59 years, 60 - 79 years, >= 80 years
gender: female, male
quarter_first_diagnosis: Q1, Q2, Q3, Q4, unknown/missing
year_first_diagnosis: 2020, 2021, 2022, 2023, unknown/missing
cohort: HAP, POP, SUEP
mild_phase: no, yes
moderate_phase: no, yes
severe_phase: no, unknown/missing, yes
patient_status_end_acute_phase: ambulant, dead, discharged, referral/transfer, unknown/missing
hospitalisation: no, yes
intensive_care_treatment: no, yes
inv_ventilation: no, unknown/missing, yes
availability_3month_followup: no/not yet, yes
ability_to_work_3MFU: n/a, no, unknown/missing, yes
any_symptom_3MFU: n/a, no, unknown/missing, yes
n/a: If a patient was never in the phase that a variable refers to, the variable has been given the value 'Not applicable (n/a)'. For example, if a patient has not completed the 3-month follow-up, the variable 'any symptoms at the 3-month follow-up' is not applicable to this patient.
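A listing like the one above can be produced by collecting the distinct values of every column. The following is a minimal sketch, assuming the data set is available as a pandas DataFrame; the three-row frame `df` is an invented stand-in and is not taken from the Public Data Set.

```python
import pandas as pd

# Invented stand-in for the Public Data Set (illustration only).
df = pd.DataFrame({
    "gender": ["female", "male", "male"],
    "cohort": ["SUEP", "POP", "HAP"],
    "any_symptom_3MFU": ["yes", "n/a", "unknown/missing"],
})

# Collect the sorted distinct values of every column.
levels = {column: sorted(df[column].unique()) for column in df.columns}

for column, values in levels.items():
    print(f"{column}: {', '.join(values)}")
```

Values that were removed during anonymization simply do not show up in such a listing, which is why 'diverse' is absent for 'gender'.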
The following descriptive statistics are computed in this section:
The number of patients from HAP is lower than from SUEP and POP. Please therefore pay attention to the differences in axis scaling, especially for HAP.
The total number of patients is 6012.
Only cases with complete documentation and Review A status are considered.
The number of patients for SUEP is 2191.
The number of patients for POP is 3507.
The number of patients for HAP is 314.
The following descriptive statistics on the health status at the end of medical consultation are computed in this section:
Note that we will use a filtered data set for computing the rates, which we describe below.
patient_status_end_acute_phase | SUEP | POP | HAP |
---|---|---|---|
discharged | 1328 | 136 | 241 |
ambulant | 342 | 3214 | 0 |
referral/transfer | 302 | 0 | 18 |
dead | 126 | 0 | 46 |
unknown/missing | 93 | 157 | 9 |
palliative care | 0 | 0 | 0 |
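A count table of this shape can be obtained with a cross-tabulation of status against cohort. This is a sketch under assumptions: the rows in `df` are invented, and only the column names `cohort` and `patient_status_end_acute_phase` come from the variable list.

```python
import pandas as pd

# Invented rows for illustration; only the column names follow the data set.
df = pd.DataFrame({
    "cohort": ["SUEP", "SUEP", "SUEP", "POP", "POP", "HAP"],
    "patient_status_end_acute_phase": [
        "discharged", "dead", "unknown/missing",
        "ambulant", "ambulant", "discharged",
    ],
})

# Count patients per status (rows) and cohort (columns).
status_counts = pd.crosstab(df["patient_status_end_acute_phase"], df["cohort"])

# Put the cohorts in the order used in the table; absent cohorts would get 0.
status_counts = status_counts.reindex(columns=["SUEP", "POP", "HAP"], fill_value=0)
print(status_counts)
```

Combinations that do not occur, such as 'dead' in POP, appear as 0 instead of being dropped.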
The number of patients for SUEP is 2191.
The number of patients for POP is 3507.
The number of patients for HAP is 314.
For the COVID-19 mortality and recovery rate computations, we exclude patients with a documented health status at the end of medical consultation of 'unknown/missing'. Please note that this influences the following computations and plots.
The number of patients in the filtered data set for SUEP is 2098.
The number of patients in the filtered data set for POP is 3350.
The number of patients in the filtered data set for HAP is 305.
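The filtered cohort sizes can be checked against the status counts reported earlier. The sketch below copies those counts into a DataFrame named `counts` (the name is ours) and drops the 'unknown/missing' row; 'palliative care' is omitted because it is 0 in every cohort.

```python
import pandas as pd

# Counts copied from the status table (rows: status, columns: cohort).
counts = pd.DataFrame(
    {
        "SUEP": [1328, 342, 302, 126, 93],
        "POP": [136, 3214, 0, 0, 157],
        "HAP": [241, 0, 18, 46, 9],
    },
    index=["discharged", "ambulant", "referral/transfer", "dead", "unknown/missing"],
)

# Exclude patients without a documented status and total the rest per cohort.
filtered = counts.drop(index="unknown/missing")
filtered_sizes = filtered.sum()
print(filtered_sizes)
```

The totals agree with the numbers stated above: 2098 for SUEP, 3350 for POP and 305 for HAP. On row-level data the same filter would be a boolean mask such as `df["patient_status_end_acute_phase"] != "unknown/missing"`.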
outcome | SUEP | POP | HAP |
---|---|---|---|
recovery | 1972 | 3350 | 259 |
dead | 126 | 0 | 46 |
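The 'recovery' counts equal the sum of all documented statuses other than 'dead' (discharged, ambulant and referral/transfer). A rate per cohort can then be sketched as each count divided by the cohort total. This is our reading of the table, not necessarily the notebook's exact computation, so rates obtained this way may differ slightly from the reported figures if the notebook applies further filtering or a different denominator.

```python
import pandas as pd

# Outcome counts copied from the table above.
outcomes = pd.DataFrame(
    {"SUEP": [1972, 126], "POP": [3350, 0], "HAP": [259, 46]},
    index=["recovery", "dead"],
)

# Share of each outcome within its cohort, in percent.
# Dividing by the column totals aligns on the cohort names.
rates = (outcomes / outcomes.sum() * 100).round(2)
print(rates)
```

Within each cohort the two rates sum to 100 by construction.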
SUEP
COVID-19 Overall Mortality Rate: 5.99 %
COVID-19 Overall Recovery Rate: 94.01 %
POP
All POP patients survived at least 6 to 12 months after first diagnosis.
HAP
COVID-19 Overall Mortality Rate: 15.18 %
COVID-19 Overall Recovery Rate: 84.82 %
All POP patients survived at least 6 to 12 months after first diagnosis. Thus POP is not shown in the following graphic.
SUEP
age | Recovery rate (%) | Mortality rate (%) |
---|---|---|
18 - 39 years | 99.01 | 0.99 |
40 - 59 years | 97.02 | 2.98 |
60 - 79 years | 89.45 | 10.55 |
>= 80 years | 88.39 | 11.61 |
HAP
age | Recovery rate (%) | Mortality rate (%) |
---|---|---|
18 - 39 years | 91.67 | 8.33 |
40 - 59 years | 89.47 | 10.53 |
60 - 79 years | 77.87 | 22.13 |
>= 80 years | 0.00 | 0.00 |
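Rates per age group can be sketched as a grouped mean of a boolean "died" flag. The rows below are invented, and treating every status other than 'dead' as recovery is an assumption carried over from the outcome counts.

```python
import pandas as pd

# Invented rows for illustration; column names follow the data set.
df = pd.DataFrame({
    "age": [
        "18 - 39 years", "18 - 39 years",
        "60 - 79 years", "60 - 79 years", "60 - 79 years", "60 - 79 years",
    ],
    "patient_status_end_acute_phase": [
        "discharged", "ambulant",
        "dead", "discharged", "discharged", "referral/transfer",
    ],
})

# True where the patient died; the mean of a boolean is the share of True values.
died = df["patient_status_end_acute_phase"].eq("dead")
mortality = died.groupby(df["age"]).mean().mul(100).round(2)

rates_by_age = pd.DataFrame({
    "Recovery rate": 100 - mortality,
    "Mortality rate": mortality,
})
print(rates_by_age)
```

Grouping by `df["gender"]` instead of `df["age"]` would give the corresponding rates by gender.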
All POP patients survived at least 6 to 12 months after first diagnosis. Thus POP is not shown in the following graphic.
SUEP
gender | Recovery rate (%) | Mortality rate (%) |
---|---|---|
female | 95.86 | 4.14 |
male | 92.73 | 7.27 |
HAP
gender | Recovery rate (%) | Mortality rate (%) |
---|---|---|
female | 85.71 | 14.29 |
male | 84.77 | 15.23 |
SUEP
Columns: patient_status_end_acute_phase (values in %)
gender | age | discharged | ambulant | referral/transfer | dead | All |
---|---|---|---|---|---|---|
female | 18 - 39 years | 5.10 | 4.29 | 0.48 | 0.05 | 9.91 |
 | 40 - 59 years | 9.63 | 3.24 | 1.43 | 0.14 | 14.44 |
 | 60 - 79 years | 8.48 | 0.57 | 2.43 | 1.29 | 12.77 |
 | >= 80 years | 2.19 | 0.00 | 0.81 | 0.19 | 3.19 |
male | 18 - 39 years | 5.86 | 2.81 | 0.52 | 0.14 | 9.34 |
 | 40 - 59 years | 13.97 | 4.15 | 3.24 | 0.95 | 22.31 |
 | 60 - 79 years | 15.82 | 1.14 | 4.29 | 2.57 | 23.83 |
 | >= 80 years | 2.24 | 0.10 | 1.19 | 0.67 | 4.19 |
All | | 63.30 | 16.30 | 14.39 | 6.01 | 100.00 |
HAP
Columns: patient_status_end_acute_phase (values in %)
gender | age | discharged | referral/transfer | dead | All |
---|---|---|---|---|---|
female | 40 - 59 years | 9.51 | 0.98 | 0.98 | 11.48 |
 | 60 - 79 years | 2.95 | 0.33 | 1.31 | 4.59 |
male | 40 - 59 years | 37.38 | 2.30 | 4.92 | 44.59 |
 | 60 - 79 years | 25.57 | 2.30 | 7.54 | 35.41 |
 | 18 - 39 years | 3.61 | 0.00 | 0.33 | 3.93 |
All | | 79.02 | 5.90 | 15.08 | 100.00 |
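Percentage tables of this kind, with an 'All' row and column, can be sketched as a cross-tabulation with margins divided by the number of patients. The four rows in `df` are invented; whether the notebook uses this exact approach is an assumption.

```python
import pandas as pd

# Invented rows for illustration; column names follow the data set.
df = pd.DataFrame({
    "gender": ["female", "female", "male", "male"],
    "age": ["40 - 59 years", "60 - 79 years", "40 - 59 years", "40 - 59 years"],
    "patient_status_end_acute_phase": [
        "discharged", "dead", "discharged", "referral/transfer",
    ],
})

# Counts per (gender, age) row and status column, with 'All' totals added.
counts = pd.crosstab(
    index=[df["gender"], df["age"]],
    columns=df["patient_status_end_acute_phase"],
    margins=True,
)

# Express every cell, including the totals, as a percentage of all patients.
shares = (counts / len(df) * 100).round(2)
print(shares)
```

Each cell is the share of all patients in the cohort, so the bottom-right 'All' cell is always 100.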
From here on we will indicate the three clinical phases as
In the following we will plot the:
Since there might be patients who have no phase documented at all, we need to proceed with a filtered data set in which those patients are dropped.
The number of patients with at least one documented phase (mild, moderate or severe) in this filtered data set is 6012.
Maximum phase reached by the patients.
cohort | max_phase | count |
---|---|---|
HAP | moderate | 189 |
 | severe | 125 |
POP | mild | 3326 |
 | moderate | 140 |
 | severe | 41 |
SUEP | mild | 342 |
 | moderate | 1429 |
 | severe | 420 |
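One way to derive the maximum phase is to take the most severe phase flagged 'yes' for each patient. The sketch below uses invented rows and a hypothetical helper `max_phase`. It assumes that 'unknown/missing' counts as not reached and that patients without any 'yes' are dropped; the notebook may handle these cases differently.

```python
import pandas as pd

# Invented rows for illustration; column names follow the data set.
df = pd.DataFrame({
    "cohort": ["HAP", "POP", "POP", "SUEP"],
    "mild_phase": ["no", "yes", "yes", "yes"],
    "moderate_phase": ["yes", "no", "yes", "yes"],
    "severe_phase": ["yes", "no", "no", "unknown/missing"],
})

PHASES = ["mild", "moderate", "severe"]  # ordered from least to most severe


def max_phase(row):
    """Return the most severe phase marked 'yes', or None if there is none."""
    reached = [phase for phase in PHASES if row[f"{phase}_phase"] == "yes"]
    return reached[-1] if reached else None


df["max_phase"] = df.apply(max_phase, axis=1)

# Drop patients without any documented phase, then count per cohort and phase.
with_phase = df.dropna(subset=["max_phase"])
phase_counts = with_phase.groupby(["cohort", "max_phase"]).size()
print(phase_counts)
```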
In the following we will plot the:
Note: All POP patients have their first visit at least 6 to 12 months after first diagnosis.