Replication package

The Impact of Feature Reduction Techniques on Defect Prediction Models

by Masanari Kondo, Cor-Paul Bezemer, Yasutaka Kamei, Ahmed E. Hassan, Osamu Mizuno
Submitted to the Empirical Software Engineering journal (EMSE)


How to use?

$ cd src/
$ bash conduct.sh # runs all experiments; this takes a VERY long time (not just 1 or 2 weeks), so please check our scripts first

The figures and tables are generated in:

./src/FR_SVL/test/performance_plot
./src/FR_SVL/test/variance_plot
./src/FR_USVL/test/performance_plot
./src/FR_USVL/test/variance_plot
./src/FR_USVL/test/clustering_plot
./src/FR_USVL/test/heatmap_plot
./src/FR_USVL/test/quantile
./src/FR_MetricsAnalysis/discussion/clustering_plot
./src/FR_MetricsAnalysis/discussion/heatmap_plot
./src/Figures/performance_plot
./src/RQ3/SVL/test/performance_plot
./src/RQ3/SVL/test/variance_plot
./src/RQ3/USVL/test/performance_plot
./src/RQ3/USVL/test/variance_plot
./src/RQ3/Tables/tables
./src/Tables/tables
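
After a run, it can be handy to confirm that the output directories listed above exist. The following is a minimal sketch, not part of the replication package: the function name missing_output_dirs is hypothetical, and it assumes the script is run from the package root (the directory that contains src/).

```python
from pathlib import Path

# Output directories as listed in this README, relative to the package root.
OUTPUT_DIRS = [
    "src/FR_SVL/test/performance_plot",
    "src/FR_SVL/test/variance_plot",
    "src/FR_USVL/test/performance_plot",
    "src/FR_USVL/test/variance_plot",
    "src/FR_USVL/test/clustering_plot",
    "src/FR_USVL/test/heatmap_plot",
    "src/FR_USVL/test/quantile",
    "src/FR_MetricsAnalysis/discussion/clustering_plot",
    "src/FR_MetricsAnalysis/discussion/heatmap_plot",
    "src/Figures/performance_plot",
    "src/RQ3/SVL/test/performance_plot",
    "src/RQ3/SVL/test/variance_plot",
    "src/RQ3/USVL/test/performance_plot",
    "src/RQ3/USVL/test/variance_plot",
    "src/RQ3/Tables/tables",
    "src/Tables/tables",
]


def missing_output_dirs(root="."):
    """Return the listed output directories that do not exist under root."""
    root = Path(root)
    return [d for d in OUTPUT_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    missing = missing_output_dirs()
    for d in missing:
        print("missing:", d)
    found = len(OUTPUT_DIRS) - len(missing)
    print("{} of {} output directories found".format(found, len(OUTPUT_DIRS)))
```

The check only tests that each directory exists; it does not verify that the figures or tables inside are complete.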

Environment:

We provide a Docker environment for this replication package. It is almost identical to the environment on our servers.

Get started:

$ bash startup.sh
$ cd ./setup
$ bash setup.sh

After that, your current terminal is a shell inside the Docker environment.

REQUIREMENTS (if you do not use Docker):

python (3.5.0):
chardet==2.2.1
colorama==0.2.5
cycler==0.10.0
html5lib==0.999
matplotlib==2.0.0
numpy==1.12.1
pandas==0.20.3
protobuf==3.1.0
pycurl==7.19.3
pygobject==3.12.0
pyparsing==2.2.0
PypeR==1.1.2
python-apt===0.9.3.5ubuntu2
python-dateutil==2.7.3
pytz==2018.5
requests==2.2.1
scikit-learn==0.19.1
scipy==0.18.0
six==1.11.0
tensorflow==0.12.0
unattended-upgrades==0.1
urllib3==1.7.1
python3-tk (install with apt-get)
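
If you set up the Python environment without Docker, you may want to compare the installed packages against the pinned versions above. The sketch below is an illustration only and is not shipped with the package: parse_pins, installed_version, and check_pins are hypothetical helper names, and the sample pins are copied from the list above. It assumes the pins are given as lines of the form name==version (or name===version).

```python
import re


def parse_pins(text):
    """Parse 'name==version' or 'name===version' lines into {name: version}."""
    pins = {}
    for line in text.splitlines():
        m = re.match(r"^\s*([A-Za-z0-9_.\-]+)\s*={2,3}\s*(\S+)\s*$", line)
        if m:
            pins[m.group(1)] = m.group(2)
    return pins


def installed_version(name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        # Available in the standard library from Python 3.8 on.
        from importlib.metadata import version, PackageNotFoundError
    except ImportError:
        # Fallback for older interpreters such as Python 3.5 (needs setuptools).
        import pkg_resources
        try:
            return pkg_resources.get_distribution(name).version
        except pkg_resources.DistributionNotFound:
            return None
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def check_pins(pins, lookup=installed_version):
    """Return {name: (expected, found)} for every pin that does not match."""
    mismatches = {}
    for name, expected in pins.items():
        found = lookup(name)
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    # A few pins from the list above; extend as needed.
    sample = "numpy==1.12.1\npandas==0.20.3\nscikit-learn==0.19.1\n"
    for name, (expected, found) in sorted(check_pins(parse_pins(sample)).items()):
        print("{}: expected {}, found {}".format(name, expected, found))
```

Lines without a version pin are skipped by parse_pins, so system packages that come from apt-get still have to be checked by hand.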

R (3.4.2):
rJava
RWeka
ranger
cluster
cclust
caret
e1071
FSelector
klaR
effsize
car
reshape2
gplots
Rtsne
ScottKnottESD

Java >= 1.8.0

others:
libcurl4-openssl-dev (apt-get install libcurl4-openssl-dev)

Download

Download the replication package here.