Chapter 9 MI500: Multi-path exploration results
Here we present the best performance and activation gene coverage achieved by each replicate of each selection scheme on the multi-path exploration diagnostic. Best performance refers to the largest average trait score found in a given population. Activation gene coverage is measured at the population level: it is the count of unique activation genes in a given population, which gives an integer between 0 and 100.
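As a minimal sketch of these two population-level metrics, consider a toy population in which each individual is a vector of trait scores. The objects `traits` and `activation_gene` are hypothetical and not part of the analysis data, and the rule used to pick an activation gene (the position of the first nonzero trait) is an assumption made only for illustration.

```r
# Hypothetical toy population: 4 individuals (rows), 5 traits each (columns).
traits <- matrix(
  c(3, 2, 1, 0, 0,
    0, 4, 3, 2, 1,
    0, 4, 3, 0, 0,
    0, 0, 0, 5, 4),
  nrow = 4, byrow = TRUE
)
dimensionality <- ncol(traits)

# Average trait score per individual; best performance is the population maximum.
avg_trait_score <- rowSums(traits) / dimensionality
best_performance <- max(avg_trait_score)

# Assumption: one activation gene per individual, taken as the first nonzero position.
activation_gene <- apply(traits, 1, function(ind) which(ind > 0)[1])

# Coverage counts the distinct activation genes present in the population.
activation_coverage <- length(unique(activation_gene))
```

In this toy case two individuals share an activation gene, so coverage is lower than the population size.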
9.2 Truncation selection
Here we analyze how the different population structures affect truncation selection (size 8) on the multi-path exploration diagnostic.
9.2.1 Performance
9.2.1.1 Performance over time
lines = filter(base_over_time, Diagnostic == 'MULTIPATH_EXPLORATION' & `Selection\nScheme` == 'TRUNCATION') %>%
group_by(Structure, Generations) %>%
dplyr::summarise(
min = min(pop_fit_max) / DIMENSIONALITY,
mean = mean(pop_fit_max) / DIMENSIONALITY,
max = max(pop_fit_max) / DIMENSIONALITY
)
ggplot(lines, aes(x=Generations, y=mean, group = Structure, fill = Structure, color = Structure, shape = Structure)) +
geom_ribbon(aes(ymin = min, ymax = max), alpha = 0.1) +
geom_line(size = 0.5) +
geom_point(data = filter(lines, Generations %% 2000 == 0), size = 2.5, stroke = 2.0, alpha = 1.0) +
scale_y_continuous(
name="Average trait score"
) +
scale_x_continuous(
name="Generations",
limits=c(0, 50000),
breaks=c(0, 10000, 20000, 30000, 40000, 50000),
labels=c("0e+4", "1e+4", "2e+4", "3e+4", "4e+4", "5e+4")
) +
scale_shape_manual(values=SHAPE)+
scale_colour_manual(values = cb_palette) +
scale_fill_manual(values = cb_palette) +
ggtitle("Performance over time") +
p_theme
9.2.1.2 Best performance
Best performance found throughout the 50,000 generations.
filter(base_best, Diagnostic == 'MULTIPATH_EXPLORATION' & `Selection\nScheme` == 'TRUNCATION' & VAR == 'pop_fit_max') %>%
ggplot(., aes(x = Structure, y = VAL / DIMENSIONALITY, color = Structure, fill = Structure, shape = Structure)) +
geom_flat_violin(position = position_nudge(x = .2, y = 0), scale = 'width', alpha = 0.2) +
geom_point(position = position_jitter(width = .1), size = 1.5, alpha = 1.0) +
geom_boxplot(color = 'black', width = .2, outlier.shape = NA, alpha = 0.0) +
scale_y_continuous(
name="Average trait score"
) +
scale_x_discrete(
name="Structure"
)+
scale_shape_manual(values=SHAPE)+
scale_colour_manual(values = cb_palette) +
scale_fill_manual(values = cb_palette) +
ggtitle('Best performance')+
p_theme + coord_flip()
9.2.1.2.1 Stats
Summary statistics for the best performance.
performance = filter(base_best, Diagnostic == 'MULTIPATH_EXPLORATION' & `Selection\nScheme` == 'TRUNCATION' & VAR == 'pop_fit_max')
performance %>%
group_by(Structure) %>%
dplyr::summarise(
count = n(),
na_cnt = sum(is.na(VAL)),
min = min(VAL, na.rm = TRUE) / DIMENSIONALITY,
median = median(VAL, na.rm = TRUE) / DIMENSIONALITY,
mean = mean(VAL, na.rm = TRUE) / DIMENSIONALITY,
max = max(VAL, na.rm = TRUE) / DIMENSIONALITY,
IQR = IQR(VAL, na.rm = TRUE) / DIMENSIONALITY
)
## # A tibble: 3 x 8
## Structure count na_cnt min median mean max IQR
## <fct> <int> <int> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 EA 100 0 5 58.0 57.0 100. 40.5
## 2 IS 100 0 11 56.0 58.3 99.9 44.5
## 3 NMIS 100 0 22.0 85.9 81.5 99.9 22.4
Kruskal–Wallis test provides evidence of a difference among population structures.
##
## Kruskal-Wallis rank sum test
##
## data: VAL by Structure
## Kruskal-Wallis chi-squared = 57.688, df = 2, p-value = 2.973e-13
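The call that produced this output is not shown. A minimal sketch of how such a test can be run with base R's `kruskal.test()` follows. The data frame `toy` is hypothetical, and the formula `VAL ~ Structure` is only inferred from the `data:` line of the output above.

```r
# Hypothetical stand-in for the `performance` data frame used above.
toy <- data.frame(
  Structure = factor(rep(c("EA", "IS", "NMIS"), each = 5)),
  VAL = c(1, 2, 3, 4, 5, 2, 3, 4, 5, 6, 11, 12, 13, 14, 15)
)

# Rank-based omnibus test for a difference among the three groups.
kw <- kruskal.test(VAL ~ Structure, data = toy)
print(kw)
```

A significant result only indicates that at least one group differs, which is why the pairwise post-hoc tests follow.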
Results for post-hoc Wilcoxon rank-sum test with a Bonferroni correction.
pairwise.wilcox.test(x = performance$VAL, g = performance$Structure, p.adjust.method = "bonferroni",
paired = FALSE, conf.int = FALSE, alternative = 'g')
##
## Pairwise comparisons using Wilcoxon rank sum test with continuity correction
##
## data: performance$VAL and performance$Structure
##
## EA IS
## IS 1 -
## NMIS 4.3e-11 1.3e-10
##
## P value adjustment method: bonferroni
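Because `alternative = 'g'` makes each comparison one-sided, the direction matters when reading the table. In base R's `pairwise.wilcox.test()`, each cell tests the row group against the column group, so with `'g'` a small p-value suggests that the row group tends to be larger than the column group. The sketch below illustrates this on hypothetical toy data; the values and the group names `A` and `B` are made up for illustration.

```r
# Hypothetical toy data: every value in group B exceeds every value in group A.
x <- c(1:10, 11:20)
g <- factor(rep(c("A", "B"), each = 10), levels = c("A", "B"))

# Row = later factor level (B), column = earlier factor level (A).
greater <- pairwise.wilcox.test(x, g, p.adjust.method = "bonferroni",
                                paired = FALSE, alternative = "greater")
less <- pairwise.wilcox.test(x, g, p.adjust.method = "bonferroni",
                             paired = FALSE, alternative = "less")

p_greater <- greater$p.value["B", "A"]  # small: B tends to be larger than A
p_less <- less$p.value["B", "A"]        # close to 1: B is not smaller than A
```

The order of the factor levels therefore decides which group sits in each row. The lexicase sections reorder the `Structure` levels and use `alternative = 'l'`, which changes both the table layout and the direction tested.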
9.2.1.3 Final performance
Performance at the end of 50,000 generations.
filter(base_over_time, Diagnostic == 'MULTIPATH_EXPLORATION' & `Selection\nScheme` == 'TRUNCATION' & Generations == 50000) %>%
ggplot(., aes(x = Structure, y = pop_fit_max / DIMENSIONALITY, color = Structure, fill = Structure, shape = Structure)) +
geom_flat_violin(position = position_nudge(x = .2, y = 0), scale = 'width', alpha = 0.2) +
geom_point(position = position_jitter(width = .1), size = 1.5, alpha = 1.0) +
geom_boxplot(color = 'black', width = .2, outlier.shape = NA, alpha = 0.0) +
scale_y_continuous(
name="Average trait score"
) +
scale_x_discrete(
name="Structure"
)+
scale_shape_manual(values=SHAPE)+
scale_colour_manual(values = cb_palette) +
scale_fill_manual(values = cb_palette) +
ggtitle('Final performance')+
p_theme + coord_flip()
9.2.1.3.1 Stats
Summary statistics for the final performance.
performance = filter(base_over_time, Diagnostic == 'MULTIPATH_EXPLORATION' & `Selection\nScheme` == 'TRUNCATION' & Generations == 50000)
performance %>%
group_by(Structure) %>%
dplyr::summarise(
count = n(),
na_cnt = sum(is.na(pop_fit_max)),
min = min(pop_fit_max / DIMENSIONALITY, na.rm = TRUE),
median = median(pop_fit_max / DIMENSIONALITY, na.rm = TRUE),
mean = mean(pop_fit_max / DIMENSIONALITY, na.rm = TRUE),
max = max(pop_fit_max / DIMENSIONALITY, na.rm = TRUE),
IQR = IQR(pop_fit_max / DIMENSIONALITY, na.rm = TRUE)
)
## # A tibble: 3 x 8
## Structure count na_cnt min median mean max IQR
## <fct> <int> <int> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 EA 100 0 5 58.0 57.0 100. 40.5
## 2 IS 100 0 11 56.0 58.3 99.9 44.5
## 3 NMIS 100 0 22.0 85.9 81.5 99.9 22.4
Kruskal–Wallis test provides evidence of a difference among population structures.
##
## Kruskal-Wallis rank sum test
##
## data: pop_fit_max by Structure
## Kruskal-Wallis chi-squared = 57.688, df = 2, p-value = 2.973e-13
Results for post-hoc Wilcoxon rank-sum test with a Bonferroni correction.
pairwise.wilcox.test(x = performance$pop_fit_max, g = performance$Structure, p.adjust.method = "bonferroni",
paired = FALSE, conf.int = FALSE, alternative = 'g')
##
## Pairwise comparisons using Wilcoxon rank sum test with continuity correction
##
## data: performance$pop_fit_max and performance$Structure
##
## EA IS
## IS 1 -
## NMIS 4.3e-11 1.3e-10
##
## P value adjustment method: bonferroni
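The Bonferroni correction multiplies each raw p-value by the number of comparisons (three pairs for three structures) and caps the result at 1, which is how an adjusted value of exactly 1 can arise. A small sketch with made-up raw p-values, using base R's `p.adjust()`:

```r
# Hypothetical raw p-values for the three pairwise comparisons.
raw_p <- c(0.4, 0.01, 0.0002)

# Bonferroni: multiply by the number of comparisons and cap at 1.
adjusted <- p.adjust(raw_p, method = "bonferroni")

# The same adjustment written out by hand.
manual <- pmin(1, raw_p * length(raw_p))
```

Both vectors agree, and the first entry is capped at 1 because three times 0.4 exceeds 1.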
9.2.2 Generation satisfactory solution found
First generation at which a satisfactory solution is found within the 50,000 generations.
filter(base_ssf, Diagnostic == 'MULTIPATH_EXPLORATION' & `Selection\nScheme` == 'TRUNCATION' & Generations <= GENERATIONS) %>%
ggplot(., aes(x = Structure, y = Generations, color = Structure, fill = Structure, shape = Structure)) +
geom_flat_violin(position = position_nudge(x = .2, y = 0), scale = 'width', alpha = 0.2) +
geom_point(position = position_jitter(width = .1), size = 1.5, alpha = 1.0) +
geom_boxplot(color = 'black', width = .2, outlier.shape = NA, alpha = 0.0) +
scale_shape_manual(values=SHAPE)+
scale_y_continuous(
name="Generations"
) +
scale_x_discrete(
name="Structure"
) +
scale_colour_manual(values = cb_palette) +
scale_fill_manual(values = cb_palette) +
p_theme + coord_flip()
9.2.2.1 Stats
Summary statistics for the first generation a satisfactory solution is found.
ssf = filter(base_ssf, Diagnostic == 'MULTIPATH_EXPLORATION' & `Selection\nScheme` == 'TRUNCATION' & Generations < 60000)
ssf %>%
group_by(Structure) %>%
dplyr::summarise(
count = n(),
na_cnt = sum(is.na(Generations)),
min = min(Generations, na.rm = TRUE),
median = median(Generations, na.rm = TRUE),
mean = mean(Generations, na.rm = TRUE),
max = max(Generations, na.rm = TRUE),
IQR = IQR(Generations, na.rm = TRUE)
)
## # A tibble: 3 x 8
## Structure count na_cnt min median mean max IQR
## <fct> <int> <int> <int> <dbl> <dbl> <int> <dbl>
## 1 EA 1 0 15300 15300 15300 15300 0
## 2 IS 2 0 26492 26654 26654 26816 162
## 3 NMIS 5 0 26188 28563 28313. 29384 372
Kruskal–Wallis test provides no evidence of a difference among population structures.
##
## Kruskal-Wallis rank sum test
##
## data: Generations by Structure
## Kruskal-Wallis chi-squared = 3.3833, df = 2, p-value = 0.1842
9.2.3 Activation gene coverage
Activation gene coverage analysis.
9.2.3.1 Coverage over time
Activation gene coverage over time.
# data for lines and shading on plots
lines = filter(base_over_time, Diagnostic == 'MULTIPATH_EXPLORATION' & `Selection\nScheme` == 'TRUNCATION') %>%
group_by(Structure, Generations) %>%
dplyr::summarise(
min = min(pop_act_cov),
mean = mean(pop_act_cov),
max = max(pop_act_cov)
)
## `summarise()` has grouped output by 'Structure'. You can override using the
## `.groups` argument.
ggplot(lines, aes(x=Generations, y=mean, group = Structure, fill = Structure, color = Structure, shape = Structure)) +
geom_ribbon(aes(ymin = min, ymax = max), alpha = 0.1) +
geom_line(size = 0.5) +
geom_point(data = filter(lines, Generations %% 2000 == 0), size = 1.5, stroke = 2.0, alpha = 1.0) +
scale_y_continuous(
name="Coverage"
) +
scale_x_continuous(
name="Generations",
limits=c(0, 50000),
breaks=c(0, 10000, 20000, 30000, 40000, 50000),
labels=c("0e+4", "1e+4", "2e+4", "3e+4", "4e+4", "5e+4")
) +
scale_shape_manual(values=SHAPE)+
scale_colour_manual(values = cb_palette) +
scale_fill_manual(values = cb_palette) +
ggtitle('Activation gene coverage over time')+
p_theme
9.2.3.2 End of 50,000 generations
Activation gene coverage in the population at the end of 50,000 generations.
### end of run
filter(base_over_time, Diagnostic == 'MULTIPATH_EXPLORATION' & `Selection\nScheme` == 'TRUNCATION' & Generations == 50000) %>%
ggplot(., aes(x = Structure, y = pop_act_cov, color = Structure, fill = Structure, shape = Structure)) +
geom_flat_violin(position = position_nudge(x = .2, y = 0), scale = 'width', alpha = 0.3) +
geom_point(position = position_jitter(height = .05, width = .05), size = 1.5, alpha = 0.5) +
geom_boxplot(color = 'black', width = .2, outlier.shape = NA, alpha = 0.0) +
scale_shape_manual(values=SHAPE)+
scale_y_continuous(
name="Coverage"
) +
scale_x_discrete(
name="Structure"
) +
scale_colour_manual(values = cb_palette) +
scale_fill_manual(values = cb_palette) +
ggtitle('Final activation gene coverage')+
p_theme + coord_flip()
9.2.3.2.1 Stats
Summary statistics for activation gene coverage.
coverage = filter(base_over_time, Diagnostic == 'MULTIPATH_EXPLORATION' & `Selection\nScheme` == 'TRUNCATION' & Generations == 50000)
coverage %>%
group_by(Structure) %>%
dplyr::summarise(
count = n(),
na_cnt = sum(is.na(pop_act_cov)),
min = min(pop_act_cov, na.rm = TRUE),
median = median(pop_act_cov, na.rm = TRUE),
mean = mean(pop_act_cov, na.rm = TRUE),
max = max(pop_act_cov, na.rm = TRUE),
IQR = IQR(pop_act_cov, na.rm = TRUE)
)
## # A tibble: 3 x 8
## Structure count na_cnt min median mean max IQR
## <fct> <int> <int> <int> <dbl> <dbl> <int> <dbl>
## 1 EA 100 0 1 2 1.96 3 0
## 2 IS 100 0 1 2 2.01 3 0
## 3 NMIS 100 0 4 6 6.38 8 1
Kruskal–Wallis test provides evidence of a difference in activation gene coverage among population structures.
##
## Kruskal-Wallis rank sum test
##
## data: pop_act_cov by Structure
## Kruskal-Wallis chi-squared = 258.93, df = 2, p-value < 2.2e-16
Results for post-hoc Wilcoxon rank-sum test with a Bonferroni correction on activation gene coverage.
pairwise.wilcox.test(x = coverage$pop_act_cov, g = coverage$Structure, p.adjust.method = "bonferroni",
paired = FALSE, conf.int = FALSE, alternative = 'g')
##
## Pairwise comparisons using Wilcoxon rank sum test with continuity correction
##
## data: coverage$pop_act_cov and coverage$Structure
##
## EA IS
## IS 0.34 -
## NMIS <2e-16 <2e-16
##
## P value adjustment method: bonferroni
9.3 Tournament selection
Here we analyze how the different population structures affect tournament selection (size 8) on the multi-path exploration diagnostic.
9.3.1 Performance
9.3.1.1 Performance over time
lines = filter(base_over_time, Diagnostic == 'MULTIPATH_EXPLORATION' & `Selection\nScheme` == 'TOURNAMENT') %>%
group_by(Structure, Generations) %>%
dplyr::summarise(
min = min(pop_fit_max) / DIMENSIONALITY,
mean = mean(pop_fit_max) / DIMENSIONALITY,
max = max(pop_fit_max) / DIMENSIONALITY
)
ggplot(lines, aes(x=Generations, y=mean, group = Structure, fill = Structure, color = Structure, shape = Structure)) +
geom_ribbon(aes(ymin = min, ymax = max), alpha = 0.1) +
geom_line(size = 0.5) +
geom_point(data = filter(lines, Generations %% 2000 == 0), size = 2.5, stroke = 2.0, alpha = 1.0) +
scale_y_continuous(
name="Average trait score"
) +
scale_x_continuous(
name="Generations",
limits=c(0, 50000),
breaks=c(0, 10000, 20000, 30000, 40000, 50000),
labels=c("0e+4", "1e+4", "2e+4", "3e+4", "4e+4", "5e+4")
) +
scale_shape_manual(values=SHAPE)+
scale_colour_manual(values = cb_palette) +
scale_fill_manual(values = cb_palette) +
ggtitle("Performance over time") +
p_theme
9.3.1.2 Best performance
Best performance found throughout the 50,000 generations.
filter(base_best, Diagnostic == 'MULTIPATH_EXPLORATION' & `Selection\nScheme` == 'TOURNAMENT' & VAR == 'pop_fit_max') %>%
ggplot(., aes(x = Structure, y = VAL / DIMENSIONALITY, color = Structure, fill = Structure, shape = Structure)) +
geom_flat_violin(position = position_nudge(x = .2, y = 0), scale = 'width', alpha = 0.2) +
geom_point(position = position_jitter(width = .1), size = 1.5, alpha = 1.0) +
geom_boxplot(color = 'black', width = .2, outlier.shape = NA, alpha = 0.0) +
scale_y_continuous(
name="Average trait score"
) +
scale_x_discrete(
name="Structure"
)+
scale_shape_manual(values=SHAPE)+
scale_colour_manual(values = cb_palette) +
scale_fill_manual(values = cb_palette) +
ggtitle('Best performance')+
p_theme + coord_flip()
9.3.1.2.1 Stats
Summary statistics for the best performance.
performance = filter(base_best, Diagnostic == 'MULTIPATH_EXPLORATION' & `Selection\nScheme` == 'TOURNAMENT' & VAR == 'pop_fit_max')
performance %>%
group_by(Structure) %>%
dplyr::summarise(
count = n(),
na_cnt = sum(is.na(VAL)),
min = min(VAL, na.rm = TRUE) / DIMENSIONALITY,
median = median(VAL, na.rm = TRUE) / DIMENSIONALITY,
mean = mean(VAL, na.rm = TRUE) / DIMENSIONALITY,
max = max(VAL, na.rm = TRUE) / DIMENSIONALITY,
IQR = IQR(VAL, na.rm = TRUE) / DIMENSIONALITY
)
## # A tibble: 3 x 8
## Structure count na_cnt min median mean max IQR
## <fct> <int> <int> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 EA 100 0 5 60.0 57.5 99.9 45.0
## 2 IS 100 0 12 59.0 57.1 99.9 43.5
## 3 NMIS 100 0 37.0 85.9 81.2 99.8 23.1
Kruskal–Wallis test provides evidence of a difference among population structures.
##
## Kruskal-Wallis rank sum test
##
## data: VAL by Structure
## Kruskal-Wallis chi-squared = 52.543, df = 2, p-value = 3.895e-12
Results for post-hoc Wilcoxon rank-sum test with a Bonferroni correction.
pairwise.wilcox.test(x = performance$VAL, g = performance$Structure, p.adjust.method = "bonferroni",
paired = FALSE, conf.int = FALSE, alternative = 'g')
##
## Pairwise comparisons using Wilcoxon rank sum test with continuity correction
##
## data: performance$VAL and performance$Structure
##
## EA IS
## IS 1 -
## NMIS 5.9e-09 5.3e-11
##
## P value adjustment method: bonferroni
9.3.1.3 Final performance
Performance at the end of 50,000 generations.
filter(base_over_time, Diagnostic == 'MULTIPATH_EXPLORATION' & `Selection\nScheme` == 'TOURNAMENT' & Generations == 50000) %>%
ggplot(., aes(x = Structure, y = pop_fit_max / DIMENSIONALITY, color = Structure, fill = Structure, shape = Structure)) +
geom_flat_violin(position = position_nudge(x = .2, y = 0), scale = 'width', alpha = 0.2) +
geom_point(position = position_jitter(width = .1), size = 1.5, alpha = 1.0) +
geom_boxplot(color = 'black', width = .2, outlier.shape = NA, alpha = 0.0) +
scale_y_continuous(
name="Average trait score"
) +
scale_x_discrete(
name="Structure"
)+
scale_shape_manual(values=SHAPE)+
scale_colour_manual(values = cb_palette) +
scale_fill_manual(values = cb_palette) +
ggtitle('Final performance')+
p_theme + coord_flip()
9.3.1.3.1 Stats
Summary statistics for the final performance.
performance = filter(base_over_time, Diagnostic == 'MULTIPATH_EXPLORATION' & `Selection\nScheme` == 'TOURNAMENT' & Generations == 50000)
performance %>%
group_by(Structure) %>%
dplyr::summarise(
count = n(),
na_cnt = sum(is.na(pop_fit_max)),
min = min(pop_fit_max / DIMENSIONALITY, na.rm = TRUE),
median = median(pop_fit_max / DIMENSIONALITY, na.rm = TRUE),
mean = mean(pop_fit_max / DIMENSIONALITY, na.rm = TRUE),
max = max(pop_fit_max / DIMENSIONALITY, na.rm = TRUE),
IQR = IQR(pop_fit_max / DIMENSIONALITY, na.rm = TRUE)
)
## # A tibble: 3 x 8
## Structure count na_cnt min median mean max IQR
## <fct> <int> <int> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 EA 100 0 5 60.0 57.5 99.9 45.0
## 2 IS 100 0 12 59.0 57.1 99.9 43.5
## 3 NMIS 100 0 37.0 85.9 81.2 99.8 23.1
Kruskal–Wallis test provides evidence of a difference among population structures.
##
## Kruskal-Wallis rank sum test
##
## data: pop_fit_max by Structure
## Kruskal-Wallis chi-squared = 52.543, df = 2, p-value = 3.895e-12
Results for post-hoc Wilcoxon rank-sum test with a Bonferroni correction.
pairwise.wilcox.test(x = performance$pop_fit_max, g = performance$Structure, p.adjust.method = "bonferroni",
paired = FALSE, conf.int = FALSE, alternative = 'g')
##
## Pairwise comparisons using Wilcoxon rank sum test with continuity correction
##
## data: performance$pop_fit_max and performance$Structure
##
## EA IS
## IS 1 -
## NMIS 5.9e-09 5.3e-11
##
## P value adjustment method: bonferroni
9.3.2 Generation satisfactory solution found
First generation at which a satisfactory solution is found within the 50,000 generations.
filter(base_ssf, Diagnostic == 'MULTIPATH_EXPLORATION' & `Selection\nScheme` == 'TOURNAMENT' & Generations <= GENERATIONS) %>%
ggplot(., aes(x = Structure, y = Generations, color = Structure, fill = Structure, shape = Structure)) +
geom_flat_violin(position = position_nudge(x = .2, y = 0), scale = 'width', alpha = 0.2) +
geom_point(position = position_jitter(width = .1), size = 1.5, alpha = 1.0) +
geom_boxplot(color = 'black', width = .2, outlier.shape = NA, alpha = 0.0) +
scale_shape_manual(values=SHAPE)+
scale_y_continuous(
name="Generations"
) +
scale_x_discrete(
name="Structure"
) +
scale_colour_manual(values = cb_palette) +
scale_fill_manual(values = cb_palette) +
p_theme + coord_flip()
9.3.2.1 Stats
Summary statistics for the first generation a satisfactory solution is found.
ssf = filter(base_ssf, Diagnostic == 'MULTIPATH_EXPLORATION' & `Selection\nScheme` == 'TOURNAMENT' & Generations < 60000)
ssf %>%
group_by(Structure) %>%
dplyr::summarise(
count = n(),
na_cnt = sum(is.na(Generations)),
min = min(Generations, na.rm = TRUE),
median = median(Generations, na.rm = TRUE),
mean = mean(Generations, na.rm = TRUE),
max = max(Generations, na.rm = TRUE),
IQR = IQR(Generations, na.rm = TRUE)
)
## # A tibble: 3 x 8
## Structure count na_cnt min median mean max IQR
## <fct> <int> <int> <int> <dbl> <dbl> <int> <dbl>
## 1 EA 6 0 25843 26598. 26813 27721 954
## 2 IS 3 0 33462 34801 34458. 35112 825
## 3 NMIS 8 0 34401 36612. 36496. 38154 989.
Kruskal–Wallis test provides evidence of a difference among population structures.
##
## Kruskal-Wallis rank sum test
##
## data: Generations by Structure
## Kruskal-Wallis chi-squared = 12.797, df = 2, p-value = 0.001664
Results for post-hoc Wilcoxon rank-sum test with a Bonferroni correction.
pairwise.wilcox.test(x = ssf$Generations, g = ssf$Structure, p.adjust.method = "bonferroni",
paired = FALSE, conf.int = FALSE, alternative = 'g')
##
## Pairwise comparisons using Wilcoxon rank sum exact test
##
## data: ssf$Generations and ssf$Structure
##
## EA IS
## IS 0.036 -
## NMIS 0.001 0.073
##
## P value adjustment method: bonferroni
9.3.3 Activation gene coverage
Activation gene coverage analysis.
9.3.3.1 Coverage over time
Activation gene coverage over time.
# data for lines and shading on plots
lines = filter(base_over_time, Diagnostic == 'MULTIPATH_EXPLORATION' & `Selection\nScheme` == 'TOURNAMENT') %>%
group_by(Structure, Generations) %>%
dplyr::summarise(
min = min(pop_act_cov),
mean = mean(pop_act_cov),
max = max(pop_act_cov)
)
## `summarise()` has grouped output by 'Structure'. You can override using the
## `.groups` argument.
ggplot(lines, aes(x=Generations, y=mean, group = Structure, fill = Structure, color = Structure, shape = Structure)) +
geom_ribbon(aes(ymin = min, ymax = max), alpha = 0.1) +
geom_line(size = 0.5) +
geom_point(data = filter(lines, Generations %% 2000 == 0), size = 1.5, stroke = 2.0, alpha = 1.0) +
scale_y_continuous(
name="Coverage"
) +
scale_x_continuous(
name="Generations",
limits=c(0, 50000),
breaks=c(0, 10000, 20000, 30000, 40000, 50000),
labels=c("0e+4", "1e+4", "2e+4", "3e+4", "4e+4", "5e+4")
) +
scale_shape_manual(values=SHAPE)+
scale_colour_manual(values = cb_palette) +
scale_fill_manual(values = cb_palette) +
ggtitle('Activation gene coverage over time')+
p_theme
9.3.3.2 End of 50,000 generations
Activation gene coverage in the population at the end of 50,000 generations.
### end of run
filter(base_over_time, Diagnostic == 'MULTIPATH_EXPLORATION' & `Selection\nScheme` == 'TOURNAMENT' & Generations == 50000) %>%
ggplot(., aes(x = Structure, y = pop_act_cov, color = Structure, fill = Structure, shape = Structure)) +
geom_flat_violin(position = position_nudge(x = .2, y = 0), scale = 'width', alpha = 0.3) +
geom_point(position = position_jitter(height = .05, width = .05), size = 1.5, alpha = 0.5) +
geom_boxplot(color = 'black', width = .2, outlier.shape = NA, alpha = 0.0) +
scale_shape_manual(values=SHAPE)+
scale_y_continuous(
name="Coverage"
) +
scale_x_discrete(
name="Structure"
) +
scale_colour_manual(values = cb_palette) +
scale_fill_manual(values = cb_palette) +
ggtitle('Final activation gene coverage')+
p_theme + coord_flip()
9.3.3.2.1 Stats
Summary statistics for activation gene coverage.
coverage = filter(base_over_time, Diagnostic == 'MULTIPATH_EXPLORATION' & `Selection\nScheme` == 'TOURNAMENT' & Generations == 50000)
coverage %>%
group_by(Structure) %>%
dplyr::summarise(
count = n(),
na_cnt = sum(is.na(pop_act_cov)),
min = min(pop_act_cov, na.rm = TRUE),
median = median(pop_act_cov, na.rm = TRUE),
mean = mean(pop_act_cov, na.rm = TRUE),
max = max(pop_act_cov, na.rm = TRUE),
IQR = IQR(pop_act_cov, na.rm = TRUE)
)
## # A tibble: 3 x 8
## Structure count na_cnt min median mean max IQR
## <fct> <int> <int> <int> <dbl> <dbl> <int> <dbl>
## 1 EA 100 0 1 2 1.96 3 0
## 2 IS 100 0 1 2 2.05 3 0
## 3 NMIS 100 0 3 6 6.22 8 1.25
Kruskal–Wallis test provides evidence of a difference in activation gene coverage among population structures.
##
## Kruskal-Wallis rank sum test
##
## data: pop_act_cov by Structure
## Kruskal-Wallis chi-squared = 264.53, df = 2, p-value < 2.2e-16
Results for post-hoc Wilcoxon rank-sum test with a Bonferroni correction on activation gene coverage.
pairwise.wilcox.test(x = coverage$pop_act_cov, g = coverage$Structure, p.adjust.method = "bonferroni",
paired = FALSE, conf.int = FALSE, alternative = 'g')
##
## Pairwise comparisons using Wilcoxon rank sum test with continuity correction
##
## data: coverage$pop_act_cov and coverage$Structure
##
## EA IS
## IS 0.019 -
## NMIS <2e-16 <2e-16
##
## P value adjustment method: bonferroni
9.4 Lexicase selection
Here we analyze how the different population structures affect standard lexicase selection on the multi-path exploration diagnostic.
9.4.1 Performance
9.4.1.1 Performance over time
lines = filter(base_over_time, Diagnostic == 'MULTIPATH_EXPLORATION' & `Selection\nScheme` == 'LEXICASE') %>%
group_by(Structure, Generations) %>%
dplyr::summarise(
min = min(pop_fit_max) / DIMENSIONALITY,
mean = mean(pop_fit_max) / DIMENSIONALITY,
max = max(pop_fit_max) / DIMENSIONALITY
)
ggplot(lines, aes(x=Generations, y=mean, group = Structure, fill = Structure, color = Structure, shape = Structure)) +
geom_ribbon(aes(ymin = min, ymax = max), alpha = 0.1) +
geom_line(size = 0.5) +
geom_point(data = filter(lines, Generations %% 2000 == 0), size = 2.5, stroke = 2.0, alpha = 1.0) +
scale_y_continuous(
name="Average trait score"
) +
scale_x_continuous(
name="Generations",
limits=c(0, 50000),
breaks=c(0, 10000, 20000, 30000, 40000, 50000),
labels=c("0e+4", "1e+4", "2e+4", "3e+4", "4e+4", "5e+4")
) +
scale_shape_manual(values=SHAPE)+
scale_colour_manual(values = cb_palette) +
scale_fill_manual(values = cb_palette) +
ggtitle("Performance over time") +
p_theme
9.4.1.2 Best performance
Best performance found throughout the 50,000 generations.
filter(base_best, Diagnostic == 'MULTIPATH_EXPLORATION' & `Selection\nScheme` == 'LEXICASE' & VAR == 'pop_fit_max') %>%
ggplot(., aes(x = Structure, y = VAL / DIMENSIONALITY, color = Structure, fill = Structure, shape = Structure)) +
geom_flat_violin(position = position_nudge(x = .2, y = 0), scale = 'width', alpha = 0.2) +
geom_point(position = position_jitter(width = .1), size = 1.5, alpha = 1.0) +
geom_boxplot(color = 'black', width = .2, outlier.shape = NA, alpha = 0.0) +
scale_y_continuous(
name="Average trait score"
) +
scale_x_discrete(
name="Structure"
)+
scale_shape_manual(values=SHAPE)+
scale_colour_manual(values = cb_palette) +
scale_fill_manual(values = cb_palette) +
ggtitle('Best performance')+
p_theme + coord_flip()
9.4.1.2.1 Stats
Summary statistics for the best performance.
performance = filter(base_best, Diagnostic == 'MULTIPATH_EXPLORATION' & `Selection\nScheme` == 'LEXICASE' & VAR == 'pop_fit_max')
performance$Structure = factor(performance$Structure, levels=c('EA','NMIS','IS'))
performance %>%
group_by(Structure) %>%
dplyr::summarise(
count = n(),
na_cnt = sum(is.na(VAL)),
min = min(VAL, na.rm = TRUE) / DIMENSIONALITY,
median = median(VAL, na.rm = TRUE) / DIMENSIONALITY,
mean = mean(VAL, na.rm = TRUE) / DIMENSIONALITY,
max = max(VAL, na.rm = TRUE) / DIMENSIONALITY,
IQR = IQR(VAL, na.rm = TRUE) / DIMENSIONALITY
)
## # A tibble: 3 x 8
## Structure count na_cnt min median mean max IQR
## <fct> <int> <int> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 EA 100 0 83.4 93.2 92.8 98.4 4.80
## 2 NMIS 100 0 66.3 75.9 76.1 86.4 5.66
## 3 IS 100 0 61.0 73.9 74.1 87.4 7.42
Kruskal–Wallis test provides evidence of a difference among population structures.
##
## Kruskal-Wallis rank sum test
##
## data: VAL by Structure
## Kruskal-Wallis chi-squared = 202.16, df = 2, p-value < 2.2e-16
Results for post-hoc Wilcoxon rank-sum test with a Bonferroni correction.
pairwise.wilcox.test(x = performance$VAL, g = performance$Structure, p.adjust.method = "bonferroni",
paired = FALSE, conf.int = FALSE, alternative = 'l')
##
## Pairwise comparisons using Wilcoxon rank sum test with continuity correction
##
## data: performance$VAL and performance$Structure
##
## EA NMIS
## NMIS <2e-16 -
## IS <2e-16 0.0032
##
## P value adjustment method: bonferroni
9.4.1.3 Final performance
Performance at the end of 50,000 generations.
filter(base_over_time, Diagnostic == 'MULTIPATH_EXPLORATION' & `Selection\nScheme` == 'LEXICASE' & Generations == 50000) %>%
ggplot(., aes(x = Structure, y = pop_fit_max / DIMENSIONALITY, color = Structure, fill = Structure, shape = Structure)) +
geom_flat_violin(position = position_nudge(x = .2, y = 0), scale = 'width', alpha = 0.2) +
geom_point(position = position_jitter(width = .1), size = 1.5, alpha = 1.0) +
geom_boxplot(color = 'black', width = .2, outlier.shape = NA, alpha = 0.0) +
scale_y_continuous(
name="Average trait score"
) +
scale_x_discrete(
name="Structure"
)+
scale_shape_manual(values=SHAPE)+
scale_colour_manual(values = cb_palette) +
scale_fill_manual(values = cb_palette) +
ggtitle('Final performance')+
p_theme + coord_flip()
9.4.1.3.1 Stats
Summary statistics for the final performance.
performance = filter(base_over_time, Diagnostic == 'MULTIPATH_EXPLORATION' & `Selection\nScheme` == 'LEXICASE' & Generations == 50000)
performance$Structure = factor(performance$Structure, levels=c('EA','NMIS','IS'))
performance %>%
group_by(Structure) %>%
dplyr::summarise(
count = n(),
na_cnt = sum(is.na(pop_fit_max)),
min = min(pop_fit_max / DIMENSIONALITY, na.rm = TRUE),
median = median(pop_fit_max / DIMENSIONALITY, na.rm = TRUE),
mean = mean(pop_fit_max / DIMENSIONALITY, na.rm = TRUE),
max = max(pop_fit_max / DIMENSIONALITY, na.rm = TRUE),
IQR = IQR(pop_fit_max / DIMENSIONALITY, na.rm = TRUE)
)
## # A tibble: 3 x 8
## Structure count na_cnt min median mean max IQR
## <fct> <int> <int> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 EA 100 0 74.4 91.3 90.6 97.2 6.69
## 2 NMIS 100 0 64.4 73.9 73.8 83.8 5.84
## 3 IS 100 0 57.7 69.5 70.6 87.4 8.30
Kruskal–Wallis test provides evidence of a difference among population structures.
##
## Kruskal-Wallis rank sum test
##
## data: pop_fit_max by Structure
## Kruskal-Wallis chi-squared = 198.85, df = 2, p-value < 2.2e-16
Results for post-hoc Wilcoxon rank-sum test with a Bonferroni correction.
pairwise.wilcox.test(x = performance$pop_fit_max, g = performance$Structure, p.adjust.method = "bonferroni",
paired = FALSE, conf.int = FALSE, alternative = 'l')
##
## Pairwise comparisons using Wilcoxon rank sum test with continuity correction
##
## data: performance$pop_fit_max and performance$Structure
##
## EA NMIS
## NMIS < 2e-16 -
## IS < 2e-16 1.6e-05
##
## P value adjustment method: bonferroni
9.4.2 Activation gene coverage
Activation gene coverage analysis.
9.4.2.1 Coverage over time
Activation gene coverage over time.
# data for lines and shading on plots
lines = filter(base_over_time, Diagnostic == 'MULTIPATH_EXPLORATION' & `Selection\nScheme` == 'LEXICASE') %>%
group_by(Structure, Generations) %>%
dplyr::summarise(
min = min(pop_act_cov),
mean = mean(pop_act_cov),
max = max(pop_act_cov)
)
## `summarise()` has grouped output by 'Structure'. You can override using the
## `.groups` argument.
ggplot(lines, aes(x=Generations, y=mean, group = Structure, fill = Structure, color = Structure, shape = Structure)) +
geom_ribbon(aes(ymin = min, ymax = max), alpha = 0.1) +
geom_line(size = 0.5) +
geom_point(data = filter(lines, Generations %% 2000 == 0), size = 1.5, stroke = 2.0, alpha = 1.0) +
scale_y_continuous(
name="Coverage"
) +
scale_x_continuous(
name="Generations",
limits=c(0, 50000),
breaks=c(0, 10000, 20000, 30000, 40000, 50000),
labels=c("0e+4", "1e+4", "2e+4", "3e+4", "4e+4", "5e+4")
) +
scale_shape_manual(values=SHAPE)+
scale_colour_manual(values = cb_palette) +
scale_fill_manual(values = cb_palette) +
ggtitle('Activation gene coverage over time')+
p_theme
9.4.2.2 End of 50,000 generations
Activation gene coverage in the population at the end of 50,000 generations.
### end of run
filter(base_over_time, Diagnostic == 'MULTIPATH_EXPLORATION' & `Selection\nScheme` == 'LEXICASE' & Generations == 50000) %>%
ggplot(., aes(x = Structure, y = pop_act_cov, color = Structure, fill = Structure, shape = Structure)) +
geom_flat_violin(position = position_nudge(x = .2, y = 0), scale = 'width', alpha = 0.3) +
geom_point(position = position_jitter(height = .05, width = .05), size = 1.5, alpha = 0.5) +
geom_boxplot(color = 'black', width = .2, outlier.shape = NA, alpha = 0.0) +
scale_shape_manual(values=SHAPE)+
scale_y_continuous(
name="Coverage"
) +
scale_x_discrete(
name="Structure"
) +
scale_colour_manual(values = cb_palette) +
scale_fill_manual(values = cb_palette) +
ggtitle('Final activation gene coverage')+
p_theme + coord_flip()
9.4.2.2.1 Stats
Summary statistics for activation gene coverage.
coverage = filter(base_over_time, Diagnostic == 'MULTIPATH_EXPLORATION' & `Selection\nScheme` == 'LEXICASE' & Generations == 50000)
coverage$Structure = factor(coverage$Structure, levels=c('EA','NMIS','IS'))
coverage %>%
group_by(Structure) %>%
dplyr::summarise(
count = n(),
na_cnt = sum(is.na(pop_act_cov)),
min = min(pop_act_cov, na.rm = TRUE),
median = median(pop_act_cov, na.rm = TRUE),
mean = mean(pop_act_cov, na.rm = TRUE),
max = max(pop_act_cov, na.rm = TRUE),
IQR = IQR(pop_act_cov, na.rm = TRUE)
)
## # A tibble: 3 x 8
## Structure count na_cnt min median mean max IQR
## <fct> <int> <int> <int> <dbl> <dbl> <int> <dbl>
## 1 EA 100 0 24 31 31.2 41 5
## 2 NMIS 100 0 24 30 30.3 44 4
## 3 IS 100 0 12 17 17.3 26 3
Kruskal–Wallis test provides evidence of a difference in activation gene coverage among population structures.
##
## Kruskal-Wallis rank sum test
##
## data: pop_act_cov by Structure
## Kruskal-Wallis chi-squared = 201.31, df = 2, p-value < 2.2e-16
Results for post-hoc Wilcoxon rank-sum test with a Bonferroni correction on activation gene coverage.
pairwise.wilcox.test(x = coverage$pop_act_cov, g = coverage$Structure, p.adjust.method = "bonferroni",
paired = FALSE, conf.int = FALSE, alternative = 'l')
##
## Pairwise comparisons using Wilcoxon rank sum test with continuity correction
##
## data: coverage$pop_act_cov and coverage$Structure
##
## EA NMIS
## NMIS 0.077 -
## IS <2e-16 <2e-16
##
## P value adjustment method: bonferroni