Merge pull request #229 from joba00002/artifact_attributes
Vertex attributes for artifacts

Reviewed-by: Thomas Bock <bockthom@cs.uni-saarland.de>
bockthom authored Oct 24, 2022
2 parents d68881d + aac337c commit 68c0515
Showing 15 changed files with 2,265 additions and 227 deletions.
14 changes: 14 additions & 0 deletions .drone.yml
@@ -37,13 +37,20 @@ steps:
- apt-get install --assume-yes libxml2
- apt-get install --assume-yes libxml2-dev
- apt-get install --assume-yes libglpk-dev
- apt-get install --assume-yes libfontconfig1-dev
- echo 'options(repos = c(CRAN = "https://cloud.r-project.org"))' >>"/usr/local/lib/R/etc/Rprofile.site"
# package installation
- Rscript install.R
# execute test suite
- Rscript tests.R
depends_on: [clone]

- name: R-4.2
pull: if-not-exists
image: rocker/r-ver:4.2.1
commands: *runTests
depends_on: [clone]

- name: R-4.1
pull: if-not-exists
image: rocker/r-ver:4.1.3
@@ -96,13 +103,20 @@ steps:
- apt-get install --assume-yes libxml2
- apt-get install --assume-yes libxml2-dev
- apt-get install --assume-yes libglpk-dev
- apt-get install --assume-yes libfontconfig1-dev
- echo 'options(repos = c(CRAN = "https://cloud.r-project.org"))' >>"/usr/local/lib/R/etc/Rprofile.site"
# package installation
- Rscript install.R
# execute showcase file
- Rscript showcase.R
depends_on: [clone]

- name: R-4.2
pull: if-not-exists
image: rocker/r-ver:4.2.1
commands: *runShowcase
depends_on: [clone]

- name: R-4.1
pull: if-not-exists
image: rocker/r-ver:4.1.3
38 changes: 38 additions & 0 deletions NEWS.md
@@ -8,8 +8,46 @@
- Incorporate custom event timestamps, i.e., add a configuration entry to the project configuration that allows specifying a file from which timestamps can be read, as well as an entry that allows locking this data; add corresponding functions `get.custom.event.timestamps`, `set.custom.event.timestamps` and `clear.custom.event.timestamps` (PR #227, 0aa342430ad3b354b9cf954dbe0838b056cf328a, 0f237d03913d2c940a008ea8fe84ba44817e77ea, c1803982357a3272b108f60cb1c976e3c2d9b1e5,
54e089db0ceea07db94914d02655a7f1f67d3117, 54673f8f88ca276ba06396116d802425093544d4, c5f5403430d55ceff6b6d5acbbca1ae9c5c231e2)
- Add function `split.data.time.based.by.timestamps` to allow using custom event timestamps for splitting. Alternatively, timestamps can be specified manually (PR #227, 5b8515f97da4a24f971be453589595d259ab1fa1, 43f23a83bc66e485fea371f958bbb2ce3ddbd8d0)
- Add the following vertex attributes for artifact vertices and corresponding helper functions.
(PR #229, 20728071ca25e1d20cfa05bc15feb3ecc0a1c434, 51b5478ae15598ed3e6115b22e440929f8084660, 56ed57a21cc8004262ebba88429d0649cb238a52, 9b060361b1d1352b5a431df3990e468df7cab572, 52d40ba3657e3c806516653626afd81018a14863, e91161c79b53be7ba8ce3bec65de01ea6be1c575)
- `add.vertex.attribute.artifact.last.edited`
- `add.vertex.attribute.mail.thread.contributer.count`, `get.mail.thread.contributor.count`
- `add.vertex.attribute.mail.thread.message.count`, `get.mail.thread.message.count`
- `add.vertex.attribute.mail.thread.start.date`, `get.mail.thread.start.date`
- `add.vertex.attribute.mail.thread.end.date`, `get.mail.thread.end.date`
- `add.vertex.attribute.mail.thread.originating.mailing.list`, `get.mail.thread.originating.mailing.list`
- `add.vertex.attribute.issue.contributor.count`, `get.issue.contributor.count`
- `add.vertex.attribute.issue.event.count`, `get.issue.event.count`
- `add.vertex.attribute.issue.comment.event.count`, `get.issue.comment.count`
- `add.vertex.attribute.issue.opened.date`, `get.issue.opened.date`
- `add.vertex.attribute.issue.closed.date`, `get.issue.closed.date`
- `add.vertex.attribute.issue.last.activity.date`, `get.issue.last.activity.date`
- `add.vertex.attribute.issue.title`, `get.issue.title`
- `add.vertex.attribute.pr.open.merged.or.closed`, `get.pr.open.merged.or.closed`
- `add.vertex.attribute.issue.is.pull.request`, `get.issue.is.pull.request`


### Changed/Improved
- Rename existing vertex attributes for author vertices to be distinguishable from attributes for artifact vertices.
With this change, the first word after `add.vertex.attribute.` now signifies the type of vertex the attribute applies to. (PR #229, 75e8514d1d2f6222d2093679f4418e9171d3abf2)
- `add.vertex.attribute.commit.count.author` -> `add.vertex.attribute.author.commit.count`
- `add.vertex.attribute.commit.count.author.not.committer` -> `add.vertex.attribute.author.commit.count.not.committer`
- `add.vertex.attribute.commit.count.committer` -> `add.vertex.attribute.author.commit.count.committer`
- `add.vertex.attribute.commit.count.committer.not.author` -> `add.vertex.attribute.author.commit.count.committer.not.author`
- `add.vertex.attribute.commit.count.committer.and.author` -> `add.vertex.attribute.author.commit.count.committer.and.author`
- `add.vertex.attribute.commit.count.committer.or.author` -> `add.vertex.attribute.author.commit.count.committer.or.author`
- `add.vertex.attribute.artifact.count` -> `add.vertex.attribute.author.artifact.count`
- `add.vertex.attribute.mail.count` -> `add.vertex.attribute.author.mail.count`
- `add.vertex.attribute.mail.thread.count` -> `add.vertex.attribute.author.mail.thread.count`
- `add.vertex.attribute.issue.count` -> `add.vertex.attribute.author.issue.count`
- `add.vertex.attribute.issues.commented.count` -> `add.vertex.attribute.author.issues.commented.count`
- `add.vertex.attribute.issue.creation.count` -> `add.vertex.attribute.author.issue.creation.count`
- `add.vertex.attribute.issue.comment.count` -> `add.vertex.attribute.author.issue.comment.count`
- `add.vertex.attribute.first.activity` -> `add.vertex.attribute.author.first.activity`
- `add.vertex.attribute.active.ranges` -> `add.vertex.attribute.author.active.ranges`
- Add parameter `use.unfiltered.data` to `add.vertex.attribute.issue.*`. This allows selecting whether the filtered or unfiltered issue data is used
for calculating the attribute. (PR #229, b77601dfa1372af5f58fb552cdb015401a344df7, 922258cb743614e0eeffcf38028acfc0a42a0332)
- Improve handling of issue type in vertex attribute name for `add.vertex.attribute.issue.*`. The default attribute name still adjusts to the issue type, but this no longer happens if the same name is specified manually. (PR #229, fe5dc61546b81c7779643c3b2b37c101a55217f8)
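
Scripts that still call the old author-attribute names need updating after this rename. The following is a minimal sketch of a migration helper in base R, not part of the library: `renamed.functions` and `migrate.line` are hypothetical names, and the mapping covers only a few of the renames listed above. It assumes calls are written as `name(` without a space before the parenthesis.

```r
## Hypothetical migration helper (not part of the library).
## The mapping holds a subset of the renames listed above: old name -> new name.
renamed.functions = c(
    "add.vertex.attribute.commit.count.author" = "add.vertex.attribute.author.commit.count",
    "add.vertex.attribute.mail.thread.count" = "add.vertex.attribute.author.mail.thread.count",
    "add.vertex.attribute.issue.creation.count" = "add.vertex.attribute.author.issue.creation.count",
    "add.vertex.attribute.issue.count" = "add.vertex.attribute.author.issue.count"
)

## Replace calls to renamed functions in a single line of R source code.
migrate.line = function(line, renames = renamed.functions) {
    for (old.name in names(renames)) {
        ## Match the old name only when directly followed by "(", so that
        ## longer old names sharing the same prefix are not rewritten here.
        old.call = paste0(old.name, "(")
        new.call = paste0(renames[[old.name]], "(")
        line = gsub(old.call, new.call, line, fixed = TRUE)
    }
    return(line)
}

old.line = 'sample = add.vertex.attribute.commit.count.author(my.networks, x.data, aggregation.level = "range")'
new.line = migrate.line(old.line)
```

Matching on the trailing `(` keeps `add.vertex.attribute.commit.count.author.not.committer` untouched until it gets its own entry in the mapping.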

### Fixed

15 changes: 8 additions & 7 deletions showcase.R
@@ -221,16 +221,16 @@ my.networks = lapply(cf.data, function(range.data) {
return (y$get.author.network())
})
## add commit-count vertex attributes
sample = add.vertex.attribute.commit.count.author(my.networks, x.data, aggregation.level = "range")
sample.cumulative = add.vertex.attribute.commit.count.author(my.networks, x.data, aggregation.level = "cumulative")
sample = add.vertex.attribute.author.commit.count(my.networks, x.data, aggregation.level = "range")
sample.cumulative = add.vertex.attribute.author.commit.count(my.networks, x.data, aggregation.level = "cumulative")
## add email-address vertex attribute
sample.mail = add.vertex.attribute.author.email(my.networks, x.data, "author.email")
sample.mail.thread = add.vertex.attribute.mail.thread.count(my.networks, x.data)
sample.issues.created = add.vertex.attribute.issue.creation.count(my.networks, x.data)
sample.pull.requests = add.vertex.attribute.issue.count(my.networks, x.data, issue.type = "pull.requests")
sample.mail.thread = add.vertex.attribute.author.mail.thread.count(my.networks, x.data)
sample.issues.created = add.vertex.attribute.author.issue.creation.count(my.networks, x.data)
sample.pull.requests = add.vertex.attribute.author.issue.count(my.networks, x.data, issue.type = "pull.requests")
## add vertex attributes for the project-level network
x.net.as.list = list("1970-01-01 00:00:00-2030-01-01 00:00:00" = x$get.author.network())
sample.entire = add.vertex.attribute.commit.count.author(x.net.as.list, x.data, aggregation.level = "complete")
sample.entire = add.vertex.attribute.author.commit.count(x.net.as.list, x.data, aggregation.level = "complete")


## / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / /
@@ -443,7 +443,7 @@ get.author.class.by.type(network = empty.network, type = "network.eigen")
get.author.class.by.type(proj.data = empty.range.data, type = "commit.count")
get.author.class.by.type(proj.data = empty.range.data, type = "loc.count")

## test function for mutliple ranges (evolution)
## test function for multiple ranges (evolution)
author.class.overview = get.author.class.overview(network.list = network.list, type = "network.degree")
get.author.class.overview(network.list = network.list, type = "network.eigen")
get.author.class.overview(range.data.list = range.list, type = "commit.count")
@@ -462,3 +462,4 @@ calculate.cohens.kappa(author.classification.list = author.class.overview,

get.class.turnover.overview(author.class.overview = author.class.overview)
get.unstable.authors.overview(author.class.overview = author.class.overview, saturation = 2)

3 changes: 3 additions & 0 deletions tests/codeface-data/configurations/testing/test_feature.conf
@@ -8,6 +8,9 @@ mailinglists:
- name: test
type: dev
source: gmane
- name: test2
type: dev
source: gmane

# date of first release:
# 2009-03-05
@@ -8,6 +8,9 @@ mailinglists:
- name: test
type: dev
source: gmane
- name: test2
type: dev
source: gmane

# date of first release:
# 2009-03-05
@@ -1,17 +1,17 @@
"Björn";"bjoern@example.org";"<adgkljsdfhkwafdkbhjasfcjn@mail.gmail.com>";"2004-10-09 18:38:13";200;"Re: Fw: busybox 202 with tab";1
"Björn";"bjoern@example.org";"<1107974989.17910.6.camel@jmcmullan>";"2005-02-09 18:49:49";-500;"Doubled date";2
"udo";"udo@example.org";"<asddghdswqeasdasd@mail.gmail.com>";"2010-07-12 10:05:36";200;"Only mail address";3
"Fritz fritz@example.org";"asd@sample.org";"<jlkjsdgihwkfjnvbjwkrbnwe@mail.gmail.com>";"2010-07-12 11:05:35";200;"name is mail address";4
"georg";"heinz@example.org";"<dfhglkjdgjkhnwrd@mail.gmail.com>";"2010-07-12 12:05:34";200;"name is mail address";5
"Hans";"hans1@example.org";"<hans1@mail.gmail.com>";"2010-07-12 12:05:40";200;"name is mail address";6
"Hans";"hans1@example.org";"<hans2@mail.gmail.com>";"2010-07-12 12:05:41";200;"name is mail address";6
"Hans";"hans1@example.org";"<hans3@mail.gmail.com>";"2010-07-12 12:05:42";200;"name is mail address";6
"Hans";"hans1@example.org";"<hans4@mail.gmail.com>";"2010-07-12 12:05:43";200;"name is mail address";6
"Hans";"hans1@example.org";"<hans5@mail.gmail.com>";"2010-07-12 12:05:44";200;"name is mail address";6
"Hans";"hans1@example.org";"<hans6@mail.gmail.com>";"2010-07-12 12:05:45";200;"name is mail address";6
"Hans";"hans1@example.org";"<hans7@mail.gmail.com>";"2010-07-12 12:05:46";200;"name is mail address";7
"Thomas";"thomas@example.org";"<saf54sd4gfasf46asf46@mail.gmail.com>";"";0;"=?KOI8-R?Q?=EF=D4=D7=C5=D4:_Some_patches?= 2";8
"Björn";"bjoern@example.org";"<4cbaa9ef0802201124v37f1eec8g89a412dfbfc8383a@mail.gmail.com>";"2016-07-12 15:58:40";0;"Re: busybox 1";8
"Olaf";"olaf@example.org";"<6784529b0802032245r5164f984l342f0f0dc94aa420@mail.gmail.com>";"2016-07-12 15:58:50";-400;"=?KOI8-R?Q?=EF=D4=D7=C5=D4:_Some_patches?= tab";8
"Thomas";"thomas@example.org";"<65a1sf31sagd684dfv31@mail.gmail.com>";"2016-07-12 16:04:40";100;"Re: Fw: busybox 2 tab";9
"Olaf";"olaf@example.org";"<9b06e8d20801220234h659c18a3g95c12ac38248c7e0@mail.gmail.com>";"2016-07-12 16:05:37";200;"Re: Fw: busybox 10";9
"Björn";"bjoern@example.org";"<adgkljsdfhkwafdkbhjasfcjn@mail.gmail.com>";"2004-10-09 18:38:13";200;"Re: Fw: busybox 202 with tab";"13#1"
"Björn";"bjoern@example.org";"<1107974989.17910.6.camel@jmcmullan>";"2005-02-09 18:49:49";-500;"Doubled date";"42#2"
"udo";"udo@example.org";"<asddghdswqeasdasd@mail.gmail.com>";"2010-07-12 10:05:36";200;"Only mail address";"13#3"
"Fritz fritz@example.org";"asd@sample.org";"<jlkjsdgihwkfjnvbjwkrbnwe@mail.gmail.com>";"2010-07-12 11:05:35";200;"name is mail address";"42#4"
"georg";"heinz@example.org";"<dfhglkjdgjkhnwrd@mail.gmail.com>";"2010-07-12 12:05:34";200;"name is mail address";"42#5"
"Hans";"hans1@example.org";"<hans1@mail.gmail.com>";"2010-07-12 12:05:40";200;"name is mail address";"42#6"
"Hans";"hans1@example.org";"<hans2@mail.gmail.com>";"2010-07-12 12:05:41";200;"name is mail address";"42#6"
"Hans";"hans1@example.org";"<hans3@mail.gmail.com>";"2010-07-12 12:05:42";200;"name is mail address";"42#6"
"Hans";"hans1@example.org";"<hans4@mail.gmail.com>";"2010-07-12 12:05:43";200;"name is mail address";"42#6"
"Hans";"hans1@example.org";"<hans5@mail.gmail.com>";"2010-07-12 12:05:44";200;"name is mail address";"42#6"
"Hans";"hans1@example.org";"<hans6@mail.gmail.com>";"2010-07-12 12:05:45";200;"name is mail address";"42#6"
"Hans";"hans1@example.org";"<hans7@mail.gmail.com>";"2010-07-12 12:05:46";200;"name is mail address";"42#7"
"Thomas";"thomas@example.org";"<saf54sd4gfasf46asf46@mail.gmail.com>";"";0;"=?KOI8-R?Q?=EF=D4=D7=C5=D4:_Some_patches?= 2";"13#8"
"Björn";"bjoern@example.org";"<4cbaa9ef0802201124v37f1eec8g89a412dfbfc8383a@mail.gmail.com>";"2016-07-12 15:58:40";0;"Re: busybox 1";"13#8"
"Olaf";"olaf@example.org";"<6784529b0802032245r5164f984l342f0f0dc94aa420@mail.gmail.com>";"2016-07-12 15:58:50";-400;"=?KOI8-R?Q?=EF=D4=D7=C5=D4:_Some_patches?= tab";"13#8"
"Thomas";"thomas@example.org";"<65a1sf31sagd684dfv31@mail.gmail.com>";"2016-07-12 16:04:40";100;"Re: Fw: busybox 2 tab";"13#9"
"Olaf";"olaf@example.org";"<9b06e8d20801220234h659c18a3g95c12ac38248c7e0@mail.gmail.com>";"2016-07-12 16:05:37";200;"Re: Fw: busybox 10";"13#9"
@@ -1,17 +1,17 @@
"Björn";"bjoern@example.org";"<adgkljsdfhkwafdkbhjasfcjn@mail.gmail.com>";"2004-10-09 18:38:13";200;"Re: Fw: busybox 202 with tab";1
"Björn";"bjoern@example.org";"<1107974989.17910.6.camel@jmcmullan>";"2005-02-09 18:49:49";-500;"Doubled date";2
"udo";"udo@example.org";"<asddghdswqeasdasd@mail.gmail.com>";"2010-07-12 10:05:36";200;"Only mail address";3
"Fritz fritz@example.org";"asd@sample.org";"<jlkjsdgihwkfjnvbjwkrbnwe@mail.gmail.com>";"2010-07-12 11:05:35";200;"name is mail address";4
"georg";"heinz@example.org";"<dfhglkjdgjkhnwrd@mail.gmail.com>";"2010-07-12 12:05:34";200;"name is mail address";5
"Hans";"hans1@example.org";"<hans1@mail.gmail.com>";"2010-07-12 12:05:40";200;"name is mail address";6
"Hans";"hans1@example.org";"<hans2@mail.gmail.com>";"2010-07-12 12:05:41";200;"name is mail address";6
"Hans";"hans1@example.org";"<hans3@mail.gmail.com>";"2010-07-12 12:05:42";200;"name is mail address";6
"Hans";"hans1@example.org";"<hans4@mail.gmail.com>";"2010-07-12 12:05:43";200;"name is mail address";6
"Hans";"hans1@example.org";"<hans5@mail.gmail.com>";"2010-07-12 12:05:44";200;"name is mail address";6
"Hans";"hans1@example.org";"<hans6@mail.gmail.com>";"2010-07-12 12:05:45";200;"name is mail address";6
"Hans";"hans1@example.org";"<hans7@mail.gmail.com>";"2010-07-12 12:05:46";200;"name is mail address";7
"Thomas";"thomas@example.org";"<saf54sd4gfasf46asf46@mail.gmail.com>";"";0;"=?KOI8-R?Q?=EF=D4=D7=C5=D4:_Some_patches?= 2";8
"Björn";"bjoern@example.org";"<4cbaa9ef0802201124v37f1eec8g89a412dfbfc8383a@mail.gmail.com>";"2016-07-12 15:58:40";0;"Re: busybox 1";8
"Olaf";"olaf@example.org";"<6784529b0802032245r5164f984l342f0f0dc94aa420@mail.gmail.com>";"2016-07-12 15:58:50";-400;"=?KOI8-R?Q?=EF=D4=D7=C5=D4:_Some_patches?= tab";8
"Thomas";"thomas@example.org";"<65a1sf31sagd684dfv31@mail.gmail.com>";"2016-07-12 16:04:40";100;"Re: Fw: busybox 2 tab";9
"Olaf";"olaf@example.org";"<9b06e8d20801220234h659c18a3g95c12ac38248c7e0@mail.gmail.com>";"2016-07-12 16:05:37";200;"Re: Fw: busybox 10";9
"Björn";"bjoern@example.org";"<adgkljsdfhkwafdkbhjasfcjn@mail.gmail.com>";"2004-10-09 18:38:13";200;"Re: Fw: busybox 202 with tab";"13#1"
"Björn";"bjoern@example.org";"<1107974989.17910.6.camel@jmcmullan>";"2005-02-09 18:49:49";-500;"Doubled date";"42#2"
"udo";"udo@example.org";"<asddghdswqeasdasd@mail.gmail.com>";"2010-07-12 10:05:36";200;"Only mail address";"13#3"
"Fritz fritz@example.org";"asd@sample.org";"<jlkjsdgihwkfjnvbjwkrbnwe@mail.gmail.com>";"2010-07-12 11:05:35";200;"name is mail address";"42#4"
"georg";"heinz@example.org";"<dfhglkjdgjkhnwrd@mail.gmail.com>";"2010-07-12 12:05:34";200;"name is mail address";"42#5"
"Hans";"hans1@example.org";"<hans1@mail.gmail.com>";"2010-07-12 12:05:40";200;"name is mail address";"42#6"
"Hans";"hans1@example.org";"<hans2@mail.gmail.com>";"2010-07-12 12:05:41";200;"name is mail address";"42#6"
"Hans";"hans1@example.org";"<hans3@mail.gmail.com>";"2010-07-12 12:05:42";200;"name is mail address";"42#6"
"Hans";"hans1@example.org";"<hans4@mail.gmail.com>";"2010-07-12 12:05:43";200;"name is mail address";"42#6"
"Hans";"hans1@example.org";"<hans5@mail.gmail.com>";"2010-07-12 12:05:44";200;"name is mail address";"42#6"
"Hans";"hans1@example.org";"<hans6@mail.gmail.com>";"2010-07-12 12:05:45";200;"name is mail address";"42#6"
"Hans";"hans1@example.org";"<hans7@mail.gmail.com>";"2010-07-12 12:05:46";200;"name is mail address";"42#7"
"Thomas";"thomas@example.org";"<saf54sd4gfasf46asf46@mail.gmail.com>";"";0;"=?KOI8-R?Q?=EF=D4=D7=C5=D4:_Some_patches?= 2";"13#8"
"Björn";"bjoern@example.org";"<4cbaa9ef0802201124v37f1eec8g89a412dfbfc8383a@mail.gmail.com>";"2016-07-12 15:58:40";0;"Re: busybox 1";"13#8"
"Olaf";"olaf@example.org";"<6784529b0802032245r5164f984l342f0f0dc94aa420@mail.gmail.com>";"2016-07-12 15:58:50";-400;"=?KOI8-R?Q?=EF=D4=D7=C5=D4:_Some_patches?= tab";"13#8"
"Thomas";"thomas@example.org";"<65a1sf31sagd684dfv31@mail.gmail.com>";"2016-07-12 16:04:40";100;"Re: Fw: busybox 2 tab";"13#9"
"Olaf";"olaf@example.org";"<9b06e8d20801220234h659c18a3g95c12ac38248c7e0@mail.gmail.com>";"2016-07-12 16:05:37";200;"Re: Fw: busybox 10";"13#9"
2 changes: 1 addition & 1 deletion tests/test-data-cut.R
@@ -69,7 +69,7 @@ test_that("Cut commit and mail data to same date range.", {
date = get.date.from.string(c("2016-07-12 16:04:40", "2016-07-12 16:05:37")),
date.offset = as.integer(c(100, 200)),
subject = c("Re: Fw: busybox 2 tab", "Re: Fw: busybox 10"),
thread = sprintf("<thread-%s>", c(9, 9)),
thread = sprintf("<thread-%s>", c("13#9", "13#9")),
artifact.type = c("Mail", "Mail"))

commit.data = x.data$get.data.cut.to.same.date(data.sources = data.sources)$get.commits.unfiltered()
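
The thread values in the updated test data follow a `<prefix>#<number>` pattern (for example `13#9`). As a rough sketch only, and assuming the part before `#` identifies the originating mailing list and the part after it the thread number (the diff itself does not state this), such IDs could be split in base R as follows. `parse.thread.id` is a hypothetical helper, not a function of the library.

```r
## Hypothetical helper (not part of the library).
## Assumption: thread IDs have the form "<mailing list id>#<thread number>".
parse.thread.id = function(thread.ids) {
    ## split each ID at the literal "#" character
    parts = strsplit(thread.ids, "#", fixed = TRUE)
    data.frame(
        mailing.list = vapply(parts, function(p) p[1], character(1)),
        thread.number = vapply(parts, function(p) p[2], character(1)),
        stringsAsFactors = FALSE
    )
}

parsed = parse.thread.id(c("13#9", "42#6"))
```

Both components are kept as character strings here, since the test data also stores the combined ID as a quoted string.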