Title: | Generative Mechanism Estimation in Temporal Complex Networks |
---|---|
Description: | Statistical methods for estimating preferential attachment and node fitness generative mechanisms in temporal complex networks are provided. Thong Pham et al. (2015) <doi:10.1371/journal.pone.0137796>. Thong Pham et al. (2016) <doi:10.1038/srep32558>. Thong Pham et al. (2020) <doi:10.18637/jss.v092.i03>. Thong Pham et al. (2021) <doi:10.1093/comnet/cnab024>. |
Authors: | Thong Pham, Paul Sheridan, Hidetoshi Shimodaira |
Maintainer: | Thong Pham <[email protected]> |
License: | GPL-3 |
Version: | 1.2.10 |
Built: | 2024-10-25 03:00:42 UTC |
Source: | https://github.com/thongphamthe/pafit |
A package for estimating preferential attachment and node fitness generative mechanisms in temporal complex networks. References: Thong Pham et al. (2015) <doi:10.1371/journal.pone.0137796>, Thong Pham et al. (2016) <doi:10.1038/srep32558>, Thong Pham et al. (2020) <doi:10.18637/jss.v092.i03>, Thong Pham et al. (2021) <doi:10.1093/comnet/cnab024>.
Package: | PAFit |
Type: | Package |
Version: | 1.2.10 |
Authors: | Thong Pham, Paul Sheridan, Hidetoshi Shimodaira |
Maintainer: | Thong Pham [email protected] |
Date: | 2024-03-28 |
License: | GPL-3 |
The PAFit package provides a comprehensive framework to deal with growth mechanisms of temporal complex networks. In particular, it implements functions to simulate various temporal network models, gather essential network statistics from raw input data, and use these summarized statistics in the estimation of the attachment function and node fitnesses. The heavy computational parts of the package are implemented in C++ through the use of the Rcpp package. Furthermore, users with a multi-core machine can enjoy a hassle-free speed up through OpenMP parallelization mechanisms implemented in the code. Apart from the main functions, the package also includes a real-world collaboration network dataset between scientists in the field of complex networks (coauthor.net). The main package functionalities are as follows.
Firstly, most well-known temporal network models based on the preferential attachment (PA) and node fitness mechanisms can be easily simulated using the package. PAFit implements generate_BA for the Barabási-Albert (BA) model, generate_ER for the growing Erdős–Rényi (ER) model, generate_BB for the Bianconi-Barabási (BB) model and generate_fit_only for the Caldarelli model. These functions have many customizable options; for example, the number of new edges at each time-step is a tunable stochastic variable. They are wrappers of the more powerful generate_net function, which simulates networks with more flexible attachment function and node fitness settings.
Secondly, the function get_statistics efficiently collects all temporal network summary statistics. It automatically handles both directed and undirected networks, and returns a list containing many statistics that can be used to characterize the network growth process. Notable fields are m_tk, containing the number of new edges that connect to a degree-k node at time-step t, and node_degree, containing the degree sequence, i.e., the degree of each node at each time-step.
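To make the meaning of m_tk concrete, the following base-R sketch counts, for a toy directed edgelist, how many new edges at each time-step attach to a node that had in-degree k just before that step. It is only an illustration and not the package's implementation: the function name count_new_edges_by_degree, the column names, the use of in-degree, and the choice to treat the first time-step as the seed graph are all assumptions made for this example.

```r
# Toy directed edgelist in three-column form: (from, to, time-stamp).
edges <- matrix(c(
  1, 2, 0,
  2, 3, 0,
  3, 1, 0,   # seed cycle at time-step 0
  4, 1, 1,   # node 4 arrives and links to node 1
  5, 1, 2,   # node 5 links to node 1
  6, 2, 2    # node 6 links to node 2
), ncol = 3, byrow = TRUE,
dimnames = list(NULL, c("from", "to", "time")))

# Hypothetical helper: rows are time-steps, columns are degrees k = 0, 1, ...
count_new_edges_by_degree <- function(edges) {
  times <- sort(unique(edges[, "time"]))
  n_nodes <- max(edges[, c("from", "to")])
  in_deg <- integer(n_nodes)          # in-degree of each node so far
  max_k <- nrow(edges)                # safe upper bound on any degree
  m_tk <- matrix(0L, nrow = length(times), ncol = max_k + 1,
                 dimnames = list(paste0("t", times), paste0("k", 0:max_k)))
  for (i in seq_along(times)) {
    targets <- edges[edges[, "time"] == times[i], "to"]
    if (i > 1) {                      # the first time-step is the seed graph
      k_before <- in_deg[targets]     # degrees measured before this step
      m_tk[i, ] <- as.integer(table(factor(k_before, levels = 0:max_k)))
    }
    in_deg <- in_deg + tabulate(targets, nbins = n_nodes)
  }
  m_tk
}

m_tk <- count_new_edges_by_degree(edges)
print(m_tk[, 1:4])
```

In this toy network, the single new edge at time-step 1 goes to a degree-1 node, and the two new edges at time-step 2 go to one degree-2 node and one degree-1 node.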
The most important functionality of the package is estimating the attachment function and node fitnesses of a temporal network. This is implemented through various methods. There are three usages: estimation of the attachment function in isolation, estimation of the node fitnesses in isolation, and the joint estimation of the attachment function and node fitnesses.
The functions for estimating the attachment function in isolation are: Jeong for Jeong's method (Ref. 1), Newman for Newman's method (Ref. 2), and only_A_estimate for the PAFit method (Ref. 3).
For estimation of node fitnesses in isolation, only_F_estimate implements a variant of the PAFit method (Ref. 4).
For the joint estimation of the attachment function and node fitnesses, we implement the full version of the PAFit method in joint_estimate (Ref. 4).
For estimating the nonparametric attachment function from a single snapshot, use PAFit_oneshot (Ref. 6).
Excluding PAFit_oneshot, the input of the remaining functions is the output object of the function get_statistics. The output object of these functions contains the estimation results as well as some additional information pertaining to the estimation process. The estimated attachment function and/or node fitnesses can be plotted by using the plot command directly on this output object. This will visualize not only the estimated results but also the remaining uncertainties when possible.
Thong Pham [email protected], Paul Sheridan, and Hidetoshi Shimodaira.
1. Jeong, H., Néda, Z. & Barabási, A. (2003). Measuring Preferential Attachment in Evolving Networks. Europhysics Letters 61(4):567-572. (doi:10.1209/epl/i2003-00166-9).
2. Newman, M. (2001). Clustering and Preferential Attachment in Growing Networks. Physical Review E 64(2):025102. (doi:10.1103/PhysRevE.64.025102).
3. Pham, T., Sheridan, P. & Shimodaira, H. (2015). PAFit: A Statistical Method for Measuring Preferential Attachment in Temporal Complex Networks. PLOS ONE 10(9):e0137796. (doi:10.1371/journal.pone.0137796).
4. Pham, T., Sheridan, P. & Shimodaira, H. (2016). Joint Estimation of Preferential Attachment and Node Fitness in Growing Complex Networks. Scientific Reports 6, Article number: 32558. (doi:10.1038/srep32558).
5. Pham, T., Sheridan, P. & Shimodaira, H. (2020). PAFit: An R Package for the Non-Parametric Estimation of Preferential Attachment and Node Fitness in Temporal Complex Networks. Journal of Statistical Software 92(3). (doi:10.18637/jss.v092.i03).
6. Pham, T., Sheridan, P. & Shimodaira, H. (2021). Non-parametric estimation of the preferential attachment function from one network snapshot. Journal of Complex Networks 9(5): cnab024. (doi:10.1093/comnet/cnab024).
See the accompanying vignette for a tutorial.
See also the GitHub page.
## Not run:
### Jointly estimate the attachment function and node fitnesses
library("PAFit")
set.seed(1)
# a Bianconi-Barabasi network
# size of initial network = 100
# number of new nodes at each time-step = 100
# Ak = k; inverse variance of distribution of fitness: s = 10
net <- generate_BB(N = 1000, m = 10, num_seed = 100, multiple_node = 100, s = 10)
net_stats <- get_statistics(net)
# Joint estimation of attachment function Ak and node fitness
result <- joint_estimate(net, net_stats)
summary(result)
# plot the estimated attachment function
plot(result, net_stats)
# true function
true_A <- pmax(result$estimate_result$center_k, 1)
lines(result$estimate_result$center_k, true_A, col = "red") # true line
legend("topleft", legend = "True function", col = "red", lty = 1, bty = "n")
# plot distribution of estimated node fitnesses
plot(result, net_stats, plot = "f")
# plot the estimated node fitnesses and true node fitnesses
plot(result, net_stats, true = net$fitness, plot = "true_f")
## End(Not run)
This function converts a graph stored in an edgelist matrix format to a PAFit_net object.
as.PAFit_net(graph, type = "directed", PA = NULL, fitness = NULL)
graph | An edgelist matrix. Each row is assumed to be of the form ( To register a new node |
type | String. Indicates whether the network is "directed" or "undirected". Default value is "directed". |
PA | Numeric vector. Contains the PA function. Default value is NULL. |
fitness | Numeric vector. Contains node fitnesses. Default value is NULL. |
An object of class PAFit_net
Thong Pham [email protected]
library("PAFit")
# a network from Bianconi-Barabasi model
net <- generate_BB(N = 50, m = 10, s = 10)
as.PAFit_net(net$graph)
The dataset is a collaboration network of authors of network science articles, with article time-stamps. An edge between two authors represents an article in common. Time-stamps denote article publication dates. The network without time-stamps was compiled by Mark Newman in May 2006 from the bibliographies of two review articles on networks, M. E. J. Newman, SIAM Review 45, 167-256 (2003) and S. Boccaletti et al., Physics Reports 424, 175-308 (2006), with a few additional references added by hand. Paul Sheridan independently supplemented the network with time-stamps and some basic metadata in June 2015. The network is undirected with monthly resolution, and contains no duplicated edges. coauthor.net contains the network. coauthor.truetime contains the real times of processed time-stamps. Finally, coauthor.author_id contains author names.
Reference: M. E. J. Newman, Finding community structure in networks using the eigenvectors of matrices, Preprint physics/0605087 (2006).
data(ComplexNetCoauthor)
coauthor.net is a matrix with 2849 rows and 3 columns. Each row is an edge with the format (author id 1, author id 2, time_stamp). coauthor.truetime is a two-column matrix in which each row is (time_stamp, real time). coauthor.author_id is a two-column matrix in which each row is (author id, author name).
https://www.paulsheridan.net/files/collabnet.zip
This function converts an igraph object (of package igraph) to a PAFit_net object.
from_igraph(net)
net | An object of class igraph. |
The function returns a PAFit_net object.
Thong Pham [email protected]
library("PAFit")
# a network from Bianconi-Barabasi model
net <- generate_BB(N = 50, m = 10, s = 10)
igraph_graph <- to_igraph(net)
back <- from_igraph(igraph_graph)
This function converts a networkDynamic object (of package networkDynamic) to a PAFit_net object.
from_networkDynamic(net)
net | An object of class networkDynamic. |
The function returns a PAFit_net object.
Thong Pham [email protected]
library("PAFit")
# a network from Bianconi-Barabasi model
net <- generate_BB(N = 50, m = 10, s = 10)
nD_graph <- to_networkDynamic(net)
back <- from_networkDynamic(nD_graph)
This function generates networks from the generalized Barabási-Albert model. In this model, the preferential attachment function is power-law, i.e. A_k = k^alpha, and node fitnesses are all equal to 1. It is a wrapper of the more powerful function generate_net.
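The growth rule of this model can be sketched in a few lines of base R. This is a simplified illustration and not the code behind generate_BA: it adds one node with one edge per time-step, uses in-degree, and gives degree-0 nodes an attachment weight of 1 (mirroring the pmax(k, 1) convention used in the package example above). The function name simulate_pa is hypothetical.

```r
# Hypothetical sketch: each new node links to one existing node, chosen with
# probability proportional to max(k, 1)^alpha, where k is its in-degree.
simulate_pa <- function(N = 100, num_seed = 2, alpha = 1) {
  stopifnot(num_seed >= 2, N > num_seed)
  # Seed graph: a directed cycle, so every seed node starts with in-degree 1.
  from <- seq_len(num_seed)
  to <- c(from[-1], 1L)
  time <- rep(0L, num_seed)
  deg <- integer(N)
  deg[seq_len(num_seed)] <- 1L
  for (new_node in (num_seed + 1):N) {
    existing <- seq_len(new_node - 1)          # node IDs are 1, 2, ...
    weights <- pmax(deg[existing], 1)^alpha    # attachment weights
    target <- sample.int(length(existing), size = 1, prob = weights)
    deg[target] <- deg[target] + 1L
    from <- c(from, new_node)
    to <- c(to, target)
    time <- c(time, new_node - num_seed)
  }
  cbind(from = from, to = to, time = time)     # three-column edgelist
}

set.seed(1)
g <- simulate_pa(N = 50, alpha = 1)
head(g)
```

Setting alpha = 0 in this sketch makes every existing node equally likely to be chosen, while alpha = 1 gives linear preferential attachment.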
generate_BA(N = 1000, num_seed = 2, multiple_node = 1, m = 1, alpha = 1)
N | Integer. Total number of nodes in the network (including the nodes in the seed graph). Default value is 1000. |
num_seed | Integer. The number of nodes of the seed graph (the initial state of the network). The seed graph is a cycle. Default value is 2. |
multiple_node | Positive integer. The number of new nodes at each time-step. Default value is 1. |
m | Positive integer. The number of edges of each new node. Default value is 1. |
alpha | Numeric. This is the attachment exponent in the attachment function A_k = k^alpha. Default value is 1. |
The output is a PAFit_net object, which is a list containing the following four fields:
graph | A three-column matrix, where each row contains information of one edge, in the form of (from node ID, to node ID, time-stamp). |
type | A string indicating whether the network is "directed" or "undirected". |
PA | A numeric vector containing the true PA function. |
fitness | Fitness values of nodes in the network. The fitnesses are all equal to 1. |
Thong Pham [email protected]
1. Barabási, A. & Albert, R. (1999). Emergence of scaling in random networks. Science, 286, 509–512 (https://www.science.org/doi/10.1126/science.286.5439.509).
For subsequent estimation procedures, see get_statistics.
For other functions to generate networks, see generate_net, generate_ER, generate_BB and generate_fit_only.
library("PAFit")
# generate a network from the BA model with alpha = 1, N = 100, m = 1
net <- generate_BA(N = 100)
str(net)
plot(net)
This function generates networks from the Bianconi-Barabási model. It is a ‘preferential attachment with fitness’ model. In this model, the preferential attachment function is linear, i.e. A_k = k, and node fitnesses are sampled from some probability distribution.
generate_BB(N = 1000, num_seed = 2, multiple_node = 1, m = 1, mode_f = "gamma", s = 10)
The parameters can be divided into two groups.
The first group specifies basic properties of the network:
N | Integer. Total number of nodes in the network (including the nodes in the seed graph). Default value is 1000. |
num_seed | Integer. The number of nodes of the seed graph (the initial state of the network). The seed graph is a cycle. Default value is 2. |
multiple_node | Positive integer. The number of new nodes at each time-step. Default value is 1. |
m | Positive integer. The number of edges of each new node. Default value is 1. |
The second group of parameters specifies the distribution from which node fitnesses are generated:
mode_f | String. Possible values: |
s | Non-negative numeric. The inverse variance parameter. The mean of the distribution is kept at |
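The role of the inverse variance parameter s can be illustrated with a short base-R sketch. It assumes, as a labelled assumption because the source text is truncated at this point, that fitnesses follow a gamma distribution with mean 1 and variance 1/s, which corresponds to shape = s and rate = s. It then shows how fitness and degree would combine into selection probabilities if a node is chosen in proportion to fitness times A_k with A_k = k.

```r
set.seed(1)
s <- 10

# Assumed parameterisation: shape = rate = s gives mean 1 and variance 1 / s.
eta <- rgamma(10000, shape = s, rate = s)
c(mean = mean(eta), variance = var(eta))

# Selection probabilities for three existing nodes, assuming the chance of
# receiving a new edge is proportional to fitness * A_k with A_k = max(k, 1).
deg <- c(3, 1, 0)          # current degrees (illustrative values)
fit3 <- c(0.5, 2, 1)       # illustrative fitnesses of the same nodes
w <- fit3 * pmax(deg, 1)
p <- w / sum(w)
print(round(p, 3))
```

Here the second node is the most likely target despite its lower degree, because its higher fitness outweighs the degree advantage of the first node. Larger values of s concentrate the fitnesses around 1, so the model approaches plain preferential attachment.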
The output is a PAFit_net object, which is a list containing the following four fields:
graph | A three-column matrix, where each row contains information of one edge, in the form of (from node ID, to node ID, time-stamp). |
type | A string indicating whether the network is "directed" or "undirected". |
PA | A numeric vector containing the true PA function. |
fitness | Fitness values of nodes in the network. The name of each value is the ID of the node. |
Thong Pham [email protected]
1. Bianconi, G. & Barabási, A. (2001). Competition and multiscaling in evolving networks. Europhys. Lett., 54, 436 (doi:10.1209/epl/i2001-00260-6).
For subsequent estimation procedures, see get_statistics.
For other functions to generate networks, see generate_net, generate_BA, generate_ER and generate_fit_only.
library("PAFit")
# generate a network from the BB model with alpha = 1, N = 100, m = 1
# The inverse variance of the Gamma distribution of node fitnesses is s = 10
net <- generate_BB(N = 100, m = 1, mode_f = "gamma", s = 10)
str(net)
plot(net)
This function generates networks from the Erdős–Rényi model. In this model, the preferential attachment function is a constant function, i.e. A_k = 1, and node fitnesses are all equal to 1. It is a wrapper of the more powerful function generate_net.
generate_ER(N = 1000, num_seed = 2, multiple_node = 1, m = 1)
N | Integer. Total number of nodes in the network (including the nodes in the seed graph). Default value is 1000. |
num_seed | Integer. The number of nodes of the seed graph (the initial state of the network). The seed graph is a cycle. Default value is 2. |
multiple_node | Positive integer. The number of new nodes at each time-step. Default value is 1. |
m | Positive integer. The number of edges of each new node. Default value is 1. |
The output is a PAFit_net object, which is a list containing the following four fields:
graph | A three-column matrix, where each row contains information of one edge, in the form of (from node ID, to node ID, time-stamp). |
type | A string indicating whether the network is "directed" or "undirected". |
PA | A numeric vector containing the true PA function. |
fitness | Fitness values of nodes in the network. The fitnesses are all equal to 1. |
Thong Pham [email protected]
1. Erdős, P. & Rényi, A. (1959). On random graphs. Publicationes Mathematicae Debrecen, 6, 290–297 (https://snap.stanford.edu/class/cs224w-readings/erdos59random.pdf).
For subsequent estimation procedures, see get_statistics.
For other functions to generate networks, see generate_net, generate_BA, generate_BB and generate_fit_only.
library("PAFit")
# generate a network from the ER model with N = 1000 nodes
net <- generate_ER(N = 1000)
str(net)
plot(net)
This function generates networks from the Caldarelli model. In this model, the preferential attachment function is constant, i.e. A_k = 1, and node fitnesses are sampled from some probability distribution.
generate_fit_only(N = 1000, num_seed = 2, multiple_node = 1, m = 1, mode_f = "gamma", s = 10)
The parameters can be divided into two groups.
The first group specifies basic properties of the network:
N | Integer. Total number of nodes in the network (including the nodes in the seed graph). Default value is 1000. |
num_seed | Integer. The number of nodes of the seed graph (the initial state of the network). The seed graph is a cycle. Default value is 2. |
multiple_node | Positive integer. The number of new nodes at each time-step. Default value is 1. |
m | Positive integer. The number of edges of each new node. Default value is 1. |
The second group of parameters specifies the distribution from which node fitnesses are generated:
mode_f | String. Possible values: |
s | Non-negative numeric. The inverse variance parameter. The mean of the distribution is kept at |
The output is a PAFit_net object, which is a list containing the following four fields:
graph | A three-column matrix, where each row contains information of one edge, in the form of (from node ID, to node ID, time-stamp). |
type | A string indicating whether the network is "directed" or "undirected". |
PA | A numeric vector containing the true PA function. |
fitness | Fitness values of nodes in the network. The name of each value is the ID of the node. |
Thong Pham [email protected]
1. Caldarelli, G., Capocci, A., De Los Rios, P. & Muñoz, M.A. (2002). Scale-Free Networks from Varying Vertex Intrinsic Fitness. Phys. Rev. Lett., 89, 258702 (doi:10.1103/PhysRevLett.89.258702).
For subsequent estimation procedures, see get_statistics.
For other functions to generate networks, see generate_net, generate_BA, generate_ER and generate_BB.
library("PAFit")
# generate a network from the Caldarelli model with N = 100, m = 1
# the inverse variance of distribution of node fitnesses is s = 10
net <- generate_fit_only(N = 100, m = 1, mode_f = "gamma", s = 10)
str(net)
plot(net)
This function generates networks from the General Temporal model, a generative temporal network model that includes many well-known models such as the Erdős–Rényi model, the Barabási-Albert model or the Bianconi-Barabási model as special cases. This function also includes some flexible mechanisms to vary the number of new nodes and new edges at each time-step in order to generate realistic networks.
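As a compact reference, the rule that unifies these models can be written as below. This is a sketch in the notation of the PAFit papers cited in the package overview: A_k is the attachment function, \eta_i the fitness of node i, and k_i(t) its degree at time-step t. The symbols are notation for this note and are not argument names of the function.

```latex
\begin{aligned}
\Pr(\text{node } i \text{ receives a new edge at time-step } t)
  &\;\propto\; A_{k_i(t)}\,\eta_i \\[4pt]
\text{Erdős–Rényi:}\quad & A_k = 1, \quad \eta_i = 1 \\
\text{Barabási-Albert (generalized):}\quad & A_k = k^{\alpha}, \quad \eta_i = 1 \\
\text{Bianconi-Barabási:}\quad & A_k = k, \quad \eta_i \text{ random} \\
\text{Caldarelli:}\quad & A_k = 1, \quad \eta_i \text{ random}
\end{aligned}
```

Each of the simpler generator functions therefore corresponds to fixing the attachment function, the fitness distribution, or both.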
generate_net(N = 1000, num_seed = 2, multiple_node = 1, specific_start = NULL, m = 1, prob_m = FALSE, increase = FALSE, log = FALSE, no_new_node_step = 0, m_no_new_node_step = m, custom_PA = NULL, mode = 1, alpha = 1, beta = 2, sat_at = 100, offset = 1, mode_f = "gamma", s = 10)
The parameters can be divided into four groups.
The first group specifies basic properties of the network:
N | Integer. Total number of nodes in the network (including the nodes in the seed graph). Default value is 1000. |
num_seed | Integer. The number of nodes of the seed graph (the initial state of the network). The seed graph is a cycle. Default value is 2. |
multiple_node | Positive integer. The number of new nodes at each time-step. Default value is 1. |
specific_start | Positive Integer. If |
The second group specifies the number of new edges at each time-step:
m | Positive integer. The number of edges of each new node. Default value is 1. |
prob_m | Logical. Indicates whether we fix the number of edges of each new node as a constant, or let it follow a Poisson distribution. If |
increase | Logical. Indicates whether we increase the mean of the Poisson distribution over time. If |
log | Logical. Indicates how to increase the mean of the Poisson distribution. If |
no_new_node_step | Non-negative integer. The number of time-steps in which no new node is added, while new edges are added between existing nodes. Default value is 0. |
m_no_new_node_step | Positive integer. The number of new edges in the no-new-node steps. Default value is equal to m. |
The third group of parameters specifies the preferential attachment function:
custom_PA | Numeric vector. This is the user-input PA function: |
mode | Integer. Indicates the parametric attachment function to be used in generating the network. If |
alpha | Numeric. If |
beta | Numeric. This is the beta in the attachment function |
sat_at | Integer. This is the saturation position |
offset | Numeric. The attachment value of degree |
The final group of parameters specifies the distribution from which node fitnesses are generated:
mode_f | String. Possible values: |
s | Non-negative numeric. The inverse variance parameter. The mean of the distribution is kept at |
The output is a PAFit_net object, which is a list containing the following four fields:
graph | A three-column matrix, where each row contains information of one edge, in the form of (from node ID, to node ID, time-stamp). |
type | A string indicating whether the network is "directed" or "undirected". |
PA | A numeric vector containing the true PA function. |
fitness | Fitness values of nodes in the network. The name of each value is the ID of the node. |
Thong Pham [email protected]
For subsequent estimation procedures, see get_statistics.
For simpler functions to generate networks from well-known models, see generate_BA, generate_ER, generate_BB and generate_fit_only.
library("PAFit")
# Generate a network from the original BA model with alpha = 1, N = 100, m = 1
net <- generate_net(N = 100, m = 1, mode = 1, alpha = 1, s = 0)
str(net)
plot(net)
This function generates simulated networks from a fitted model and performs estimations on these simulated networks with the same setting used in the original estimation. Each simulated network is generated using parameters of the fitted model, while keeping other aspects of the growth process as faithfully as possible to the original observed network.
generate_simulated_data_from_estimated_model(net_object, net_stat, result, M = 5)
net_object | An object of class PAFit_net. |
net_stat | An object of class PAFit_data. |
result | An object of class Full_PAFit_result. |
M | Integer. The number of simulated networks. Default value is 5. |
Outputs a Simulated_Data_From_Fitted_Model object, which is a list containing the following fields:
graph_list: a list containing M simulated graphs.
stats_list: a list containing M objects of class PAFit_data, which are the results of applying get_statistics on the simulated graphs.
result_list: a list containing M objects of class Full_PAFit_result, which are the results of applying joint_estimate on the simulated graphs.
Thong Pham [email protected]
1. Pham, T., Sheridan, P. & Shimodaira, H. (2015). PAFit: A Statistical Method for Measuring Preferential Attachment in Temporal Complex Networks. PLoS ONE 10(9): e0137796. (doi:10.1371/journal.pone.0137796).
2. Pham, T., Sheridan, P. & Shimodaira, H. (2016). Joint Estimation of Preferential Attachment and Node Fitness in Growing Complex Networks. Scientific Reports 6, Article number: 32558. (doi:10.1038/srep32558).
3. Pham, T., Sheridan, P. & Shimodaira, H. (2020). PAFit: An R Package for the Non-Parametric Estimation of Preferential Attachment and Node Fitness in Temporal Complex Networks. Journal of Statistical Software 92 (3). (doi:10.18637/jss.v092.i03).
4. Inoue, M., Pham, T. & Shimodaira, H. (2020). Joint Estimation of Non-parametric Transitivity and Preferential Attachment Functions in Scientific Co-authorship Networks. Journal of Informetrics 14(3). (doi:10.1016/j.joi.2020.101042).
get_statistics, joint_estimate, plot_contribution
## Not run:
library("PAFit")
net_object <- generate_net(N = 500, m = 10, s = 10, alpha = 0.5)
net_stat <- get_statistics(net_object)
result <- joint_estimate(net_object, net_stat)
simulated_data <- generate_simulated_data_from_estimated_model(net_object, net_stat, result)
plot_contribution(simulated_data, result, which_plot = "PA")
plot_contribution(simulated_data, result, which_plot = "fit")
## End(Not run)
The function summarizes input data into sufficient statistics for estimating the attachment function and node fitness, together with additional information about the data, such as the total number of nodes, the number of time-steps, the maximum degree, and the final degree of the network. It also provides mechanisms to automatically deal with very large datasets by binning the degree, setting a degree threshold, or grouping time-steps.
get_statistics(net_object, only_PA = FALSE, only_true_deg_matrix = FALSE, binning = TRUE, g = 50, deg_threshold = 0, compress_mode = 0, compress_ratio = 0.5, custom_time = NULL)
The parameters can be divided into four groups. The first group specifies input data and how the data will be summarized:
net_object | An object of class PAFit_net. |
only_PA | Logical. Indicates whether only the statistics for estimating |
only_true_deg_matrix | Logical. Return only the true degree matrix (without binning); no other statistics are returned. The result cannot be used in |
The second group of parameters specifies how to bin the degrees:
binning | Logical. Indicates whether the degree should be binned together. Default value is TRUE. |
g | Positive integer. Number of bins. Should be at least |
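To illustrate what binning degrees into at most g groups can look like, here is a base-R sketch that builds log-spaced bins over degrees 1 to deg_max and maps individual degrees to bins. The log spacing is an assumption made for this example, since the text above does not state the binning rule; the function name make_log_bins is hypothetical, and the column names are borrowed from the begin_deg, end_deg and interval_length fields listed in the return value.

```r
# Hypothetical helper: at most g contiguous bins covering degrees 1..deg_max,
# with boundaries spaced evenly on a log scale (an assumption of this sketch).
make_log_bins <- function(deg_max, g = 50) {
  raw <- exp(seq(0, log(deg_max + 1), length.out = g + 1))
  raw[length(raw)] <- deg_max + 1      # guard against floating-point drift
  breaks <- unique(floor(raw))         # integer boundaries, duplicates merged
  begin_deg <- breaks[-length(breaks)]
  end_deg <- breaks[-1] - 1
  data.frame(begin_deg, end_deg, interval_length = end_deg - begin_deg + 1)
}

bins <- make_log_bins(deg_max = 1000, g = 10)
print(bins)

# Map a few degrees to the index of the bin that contains them.
bin_index <- findInterval(c(1, 7, 1000), c(bins$begin_deg, Inf))
print(bin_index)
```

Low degrees end up in narrow bins and high degrees in wide ones, which keeps the number of estimated attachment values small while preserving resolution where most nodes are.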
The third group contains a single parameter specifying how to reduce the number of node fitnesses:
deg_threshold | Integer. We only estimate the fitnesses of nodes whose number of new edges acquired is at least |
The last group of parameters specifies how to group the time-stamps:
compress_mode | Integer. Indicates whether the timeline should be compressed. The value of CompressMode: Default value is 0. |
compress_ratio | Numeric. Indicates how much we should compress if CompressMode is |
custom_time | Vector. Custom time stamps. This vector is a subset of the vector that contains all time-stamps. Only effective if |
An object of class PAFit_data, which is a list. Some important fields are:
offset_tk | A matrix where the |
n_tk | A matrix where the |
m_tk | A matrix where the |
sum_m_k | A vector where the |
node_degree | A matrix recording the degree of all nodes (that satisfy |
m_t | A vector where the |
z_j | A vector where the |
N | Numeric. The number of nodes in the network. |
T | Numeric. The number of time-steps. |
deg_max | Numeric. The maximum degree in the final network. |
node_id | A vector containing the ID of all nodes. |
final_deg | A vector containing the final degree of all nodes (including those that do not satisfy the |
deg_thresh | Integer. The specified degree threshold. |
f_position | Numeric vector. The index in the |
start_deg | Integer. The specified degree at which we start binning. |
begin_deg | Numeric vector containing the beginning degree of each bin. |
end_deg | Numeric vector containing the ending degree of each bin. |
interval_length | Numeric vector containing the length of each bin. |
binning | Logical. Indicates whether binning was applied or not. |
g | Integer. Number of bins. |
time_compress_mode | Integer. The mode of time compression. |
t_compressed | Integer. The number of time-stamps actually used. |
compressed_unique_time | The time-stamps that are actually used. |
compress_ratio | Numeric. |
custom_time | Vector. The time-stamps specified by the user. |
Thong Pham [email protected]
For creating the needed input for this function (a PAFit_net object), see as.PAFit_net, from_igraph, from_networkDynamic, and graph_from_file.
For the next step, see Newman, Jeong or only_A_estimate for estimating the attachment function in isolation, only_F_estimate for estimating node fitnesses in isolation, and joint_estimate for joint estimation of the attachment function and node fitnesses.
library("PAFit")
net <- generate_BA(N = 100, m = 1)
net_stats <- get_statistics(net)
summary(net_stats)
This function reads an input file into a PAFit_net object. Accepted formats are the edgelist format or the gml format.
graph_from_file(file_name, format = "edgelist", type = "directed")
file_name | A string indicating the file name. |
format | String. Possible values are If To register a new node If |
type | String. Indicates whether the network is "directed" or "undirected". Default value is "directed". |
An object of class PAFit_net containing the network.
Thong Pham [email protected]
library("PAFit")
# a network from Bianconi-Barabasi model
net <- generate_BB(N = 50, m = 10, s = 10)
#graph_to_file(net, file_name = "test.gml", format = "gml")
#reread <- graph_from_file(file_name = "test.gml", format = "gml")
This function writes a graph in a PAFit_net object to an output file. Accepted file formats are the edgelist format or the gml format.
graph_to_file(net_object, file_name, format = "edgelist")
net_object | An object of class PAFit_net. |
file_name | A string indicating the file name. |
format | String. Possible values are If If |
The function writes directly to the output file.
Thong Pham [email protected]
library("PAFit")
# a network from the Bianconi-Barabasi model
net <- generate_BB(N = 50, m = 10, s = 10)
# graph_to_file(net, file_name = "test.gml", format = "gml")
This function estimates the preferential attachment function by Jeong's method.
Jeong(net_object , net_stat = get_statistics(net_object) , T_0_start = 0 , T_0_end = round(net_stat$T * 0.75) , T_1_start = T_0_end + 1 , T_1_end = net_stat$T , interpolate = FALSE)
net_object |
an object of class |
net_stat |
An object of class |
T_0_start |
Positive integer. The starting time-step of the |
T_0_end |
Positive integer. The ending time-step of |
T_1_start |
Positive integer. The starting time-step of the |
T_1_end |
Positive integer. The ending time-step of |
interpolate |
Logical. If |
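The idea behind the two time windows can be illustrated with a small base-R sketch. It is not PAFit's implementation: the degrees and edge targets below are made-up data, and normalising by the smallest degree is a choice made only for this illustration. The estimate for degree k is the number of new edges received in the second window by nodes that had degree k at the end of the first window, divided by the number of such nodes.

```r
# Toy illustration of a Jeong-style ratio estimator (base R only; hypothetical data).
# deg_T0: degree of each existing node at the end of the first window.
# new_edges_to: ids of the nodes that receive new edges during the second window.
deg_T0       <- c(1, 1, 1, 2, 2, 4)
new_edges_to <- c(6, 6, 4, 6, 5, 1, 6, 4)

# n_k: number of nodes with degree k at the end of the first window.
n_k <- table(deg_T0)
# m_k: number of new edges that went to nodes of degree k.
m_k <- table(factor(deg_T0[new_edges_to], levels = names(n_k)))

# Ratio estimate, rescaled so that the smallest observed degree has value 1
# (an assumption of this sketch, not necessarily PAFit's convention).
A_hat <- as.numeric(m_k) / as.numeric(n_k)
A_hat <- A_hat / A_hat[1]
k     <- as.numeric(names(n_k))

print(data.frame(k = k, A = A_hat))
```

In this toy data the estimate grows with the degree, which is what a rich-get-richer mechanism would produce.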
Outputs a PA_result object which contains the estimated attachment function. In particular, it contains the following fields:
k and A: a degree vector and the estimated PA function.
center_k and theta: when we perform binning, these are the centers of the bins and the estimated PA values for those bins.
g: the number of bins used.
alpha and ci: alpha is the estimated attachment exponent (when assuming A_k = k^alpha), while ci is the confidence interval.
loglinear_fit: this is the fitting result when we estimate alpha.
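How an attachment exponent can be read off binned estimates is sketched below in base R. This is only an illustration under the log-linear model A_k = k^alpha: the values of center_k and theta are synthetic, and PAFit's own fitting procedure may differ (for example in how points are weighted).

```r
# Synthetic binned estimates that roughly follow A_k = k^0.8 (hypothetical data).
center_k <- c(1, 2, 4, 8, 16, 32)
theta    <- center_k^0.8 * c(1.05, 0.97, 1.02, 0.99, 1.03, 0.96)

# Under A_k = k^alpha, log(A_k) = alpha * log(k), so the slope of a
# log-log regression is an estimate of alpha.
fit       <- lm(log(theta) ~ log(center_k))
alpha_hat <- unname(coef(fit)[2])

# A rough interval of two standard errors around the slope.
se <- summary(fit)$coefficients[2, "Std. Error"]
ci <- c(alpha_hat - 2 * se, alpha_hat + 2 * se)

print(alpha_hat)
print(ci)
```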
Thong Pham [email protected]
1. Jeong, H., Néda, Z. & Barabási, A. (2003). Measuring preferential attachment in evolving networks. Europhysics Letters 61(4): 567–572. (doi:10.1209/epl/i2003-00166-9).
See get_statistics for how to create the summarized statistics needed in this function.
See Newman and only_A_estimate for other methods to estimate the attachment function in isolation.
library("PAFit")
net <- generate_net(N = 1000, m = 1, mode = 1, alpha = 1, s = 0)
net_stats <- get_statistics(net)
result <- Jeong(net, net_stats)
# true function
true_A <- result$center_k
# plot the estimated attachment function
plot(result, net_stats)
lines(result$center_k, true_A, col = "red") # true line
legend("topleft", legend = "True function", col = "red", lty = 1, bty = "n")
This function jointly estimates the attachment function A_k and the node fitnesses. It first performs a cross-validation to select the optimal pair of hyper-parameters (returned in the fields ratio and shape), then estimates the attachment function and the node fitnesses using that optimal pair with the full data (Ref. 2).
joint_estimate(net_object , net_stat = get_statistics(net_object), p = 0.75 , stop_cond = 10^-8 , mode_reg_A = 0 , ...)
net_object |
an object of class |
net_stat |
An object of class |
p |
Numeric. This is the ratio of the number of new edges in the learning data to that of the full data. The data is then divided into two parts: learning data and testing data based on |
stop_cond |
Numeric. The iterative algorithm stops when |
mode_reg_A |
Binary. Indicates which regularization term is used for
|
... |
Other arguments to pass to the underlying algorithm. |
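The role of p can be pictured with a short base-R sketch. It is an assumption-laden illustration, not the package's code: the per-time-step edge counts are invented, and cutting at the first time-step where the cumulative share of new edges reaches p is simply one plausible way to realise such a split.

```r
# Hypothetical number of new edges observed at each of 10 time-steps.
new_edges <- c(5, 8, 12, 10, 15, 20, 18, 25, 30, 27)
p <- 0.75

# Cumulative share of new edges up to each time-step.
share <- cumsum(new_edges) / sum(new_edges)

# First time-step at which the learning part holds at least a fraction p
# of all new edges (the cut rule is an assumption of this sketch).
T_learn <- which(share >= p)[1]

learning_steps <- seq_len(T_learn)
testing_steps  <- setdiff(seq_along(new_edges), learning_steps)

print(T_learn)
print(testing_steps)
```

Candidate hyper-parameter values would then be fitted on the learning steps and compared on the testing steps before the final fit on the full data.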
Outputs a Full_PAFit_result object, which is a list containing the following fields:
cv_data: a CV_Data object which contains the cross-validation data. This is the testing data.
cv_result: a CV_Result object which contains the cross-validation result. Normally the user does not need to pay attention to this data.
estimate_result: this is a PAFit_result object which contains the estimated attachment function A_k, the estimated fitnesses and their confidence intervals. In particular, the important fields are:
ratio: the selected value of the hyper-parameter that controls the regularization of the attachment function.
shape: the selected value of the hyper-parameter of the prior on node fitnesses.
k and A: a degree vector and the estimated PA function.
var_A: the estimated variance of A.
var_logA: the estimated variance of log A.
upper_A: the upper value of the interval of two standard deviations around A.
lower_A: the lower value of the interval of two standard deviations around A.
center_k and theta: when we perform binning, these are the centers of the bins and the estimated PA values for those bins. theta is similar to A but with duplicated values removed.
var_bin: the variance of theta. Same as var_A but with duplicated values removed.
upper_bin: the upper value of the interval of two standard deviations around theta. Same as upper_A but with duplicated values removed.
lower_bin: the lower value of the interval of two standard deviations around theta. Same as lower_A but with duplicated values removed.
g: the number of bins used.
alpha and ci: alpha is the estimated attachment exponent (when assuming A_k = k^alpha), while ci is the confidence interval.
loglinear_fit: this is the fitting result when we estimate alpha.
f: the estimated node fitnesses.
var_f: the estimated variance of f.
upper_f: the estimated upper value of the interval of two standard deviations around f.
lower_f: the estimated lower value of the interval of two standard deviations around f.
objective_value: values of the objective function over iterations in the final run with the full data.
diverge_zero: logical value indicating whether the algorithm diverged in the final run with the full data.
contribution: a list containing an estimate of the contributions of the preferential attachment and fitness mechanisms in the growth process of the network. The calculation adapts a quantification method proposed in Section 3 of Ref. 4, which is for preferential attachment and transitivity, to preferential attachment and fitness.
PA_contribution: an array containing the contributions of preferential attachment at each time-step.
fit_contribution: an array containing the contributions of the fitness mechanism at each time-step.
mean_PA_contrib: the average contribution of preferential attachment through the whole growth process.
mean_fit_contrib: the average contribution of the fitness mechanism through the whole growth process.
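The relation between the per-degree vector A and the per-bin vector theta can be pictured with a small base-R sketch. All values are hypothetical, and taking the arithmetic mean of the degrees in a bin as its center is an assumption made here for illustration; the package may define bin centers differently.

```r
# Degrees 0..9 assigned to 5 bins (hypothetical binning).
k   <- 0:9
bin <- c(1, 2, 3, 3, 4, 4, 4, 4, 5, 5)

# One estimated PA value per bin (hypothetical numbers).
theta <- c(1.0, 1.8, 2.9, 5.5, 8.7)

# Every degree inherits the value of its bin, so A repeats values of theta.
A <- theta[bin]

# Bin centers, taken here as the mean degree inside each bin.
center_k <- as.numeric(tapply(k, bin, mean))

print(data.frame(k = k, A = A))
print(data.frame(center_k = center_k, theta = theta))
```

Removing the duplicated values of A recovers theta, which is why the binned fields are shorter than their per-degree counterparts.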
Thong Pham [email protected]
1. Pham, T., Sheridan, P. & Shimodaira, H. (2015). PAFit: A Statistical Method for Measuring Preferential Attachment in Temporal Complex Networks. PLoS ONE 10(9): e0137796. (doi:10.1371/journal.pone.0137796).
2. Pham, T., Sheridan, P. & Shimodaira, H. (2016). Joint Estimation of Preferential Attachment and Node Fitness in Growing Complex Networks. Scientific Reports 6, Article number: 32558. (doi:10.1038/srep32558).
3. Pham, T., Sheridan, P. & Shimodaira, H. (2020). PAFit: An R Package for the Non-Parametric Estimation of Preferential Attachment and Node Fitness in Temporal Complex Networks. Journal of Statistical Software 92 (3). (doi:10.18637/jss.v092.i03).
4. Inoue, M., Pham, T. & Shimodaira, H. (2020). Joint Estimation of Non-parametric Transitivity and Preferential Attachment Functions in Scientific Co-authorship Networks. Journal of Informetrics 14(3). (doi:10.1016/j.joi.2020.101042).
See get_statistics for how to create the summarized statistics needed in this function.
See Jeong, Newman and only_A_estimate for functions to estimate the attachment function in isolation.
See only_F_estimate for a function to estimate node fitnesses in isolation.
## Not run:
library("PAFit")
#### Example 1: a linear preferential attachment kernel, i.e., A_k = k ####
set.seed(1)
# size of initial network = 100
# number of new nodes at each time-step = 100
# Ak = k; inverse variance of the distribution of node fitnesses = 5
net <- generate_BB(N = 1000, m = 50, num_seed = 100, multiple_node = 100, s = 5)
net_stats <- get_statistics(net)
# Joint estimation of attachment function Ak and node fitness
result <- joint_estimate(net, net_stats)
summary(result)
# plot the estimated attachment function
true_A <- pmax(result$estimate_result$center_k, 1) # true function
plot(result, net_stats, max_A = max(true_A, result$estimate_result$theta))
lines(result$estimate_result$center_k, true_A, col = "red") # true line
legend("topleft", legend = "True function", col = "red", lty = 1, bty = "n")
# plot the estimated node fitnesses and true node fitnesses
plot(result, net_stats, true = net$fitness, plot = "true_f")

#### Example 2: a non-log-linear preferential attachment kernel ####
set.seed(1)
# size of initial network = 100
# number of new nodes at each time-step = 100
# A_k = alpha * log(max(k, 1))^beta + 1, with alpha = 2, and beta = 2
# inverse variance of the distribution of node fitnesses = 10
net <- generate_net(N = 1000, m = 50, num_seed = 100, multiple_node = 100,
                    s = 10, mode = 3, alpha = 2, beta = 2)
net_stats <- get_statistics(net)
# Joint estimation of attachment function Ak and node fitness
result <- joint_estimate(net, net_stats)
summary(result)
# plot the estimated attachment function
true_A <- 2 * log(pmax(result$estimate_result$center_k, 1))^2 + 1 # true function
plot(result, net_stats, max_A = max(true_A, result$estimate_result$theta))
lines(result$estimate_result$center_k, true_A, col = "red") # true line
legend("topleft", legend = "True function", col = "red", lty = 1, bty = "n")
# plot the estimated node fitnesses and true node fitnesses
plot(result, net_stats, true = net$fitness, plot = "true_f")

#### Example 3: another non-log-linear preferential attachment kernel ####
set.seed(1)
# size of initial network = 100
# number of new nodes at each time-step = 100
# A_k = min(max(k, 1), sat_at)^alpha, with alpha = 1, and sat_at = 100
# inverse variance of the distribution of node fitnesses = 10
net <- generate_net(N = 1000, m = 50, num_seed = 100, multiple_node = 100,
                    s = 10, mode = 2, alpha = 1, sat_at = 100)
net_stats <- get_statistics(net)
# Joint estimation of attachment function Ak and node fitness
result <- joint_estimate(net, net_stats)
summary(result)
# plot the estimated attachment function
true_A <- pmin(pmax(result$estimate_result$center_k, 1), 100)^1 # true function
plot(result, net_stats, max_A = max(true_A, result$estimate_result$theta))
lines(result$estimate_result$center_k, true_A, col = "red") # true line
legend("topleft", legend = "True function", col = "red", lty = 1, bty = "n")
# plot the estimated node fitnesses and true node fitnesses
plot(result, net_stats, true = net$fitness, plot = "true_f")
## End(Not run)
This function implements the correction, proposed in [1], of Newman's original method [2] for estimating the preferential attachment function.
Newman(net_object , net_stat = get_statistics(net_object), start = 1 , interpolate = FALSE)
net_object |
an object of class |
net_stat |
An object of class |
start |
Positive integer. The starting time from which the method is applied. Default value is |
interpolate |
Logical. If |
Outputs a PA_result object which contains the estimated attachment function. In particular, it contains the following fields:
k and A: a degree vector and the estimated PA function.
center_k and theta: when we perform binning, these are the centers of the bins and the estimated PA values for those bins.
g: the number of bins used.
alpha and ci: alpha is the estimated attachment exponent (when assuming A_k = k^alpha), while ci is the mean plus/minus two-standard-deviation interval.
loglinear_fit: this is the fitting result when we estimate alpha.
Thong Pham [email protected]
1. Pham, T., Sheridan, P. & Shimodaira, H. (2015). PAFit: A Statistical Method for Measuring Preferential Attachment in Temporal Complex Networks. PLoS ONE 10(9): e0137796. (doi:10.1371/journal.pone.0137796).
2. Newman, M. (2001). Clustering and preferential attachment in growing networks. Physical Review E 64(2): 025102. (doi:10.1103/PhysRevE.64.025102).
See get_statistics for how to create the summarized statistics needed in this function.
See Jeong and only_A_estimate for other methods to estimate the attachment function in isolation.
library("PAFit")
net <- generate_net(N = 1000, m = 1, mode = 1, alpha = 1, s = 0)
net_stats <- get_statistics(net)
result <- Newman(net, net_stats)
summary(result)
# true function
true_A <- result$center_k
# plot the estimated attachment function
plot(result, net_stats)
lines(result$center_k, true_A, col = "red") # true line
legend("topleft", legend = "True function", col = "red", lty = 1, bty = "n")
This function estimates the attachment function A_k using the PAFit method. The method has one hyper-parameter, which controls the regularization of A_k. It first performs a cross-validation step to select the optimal value of this hyper-parameter (returned in the field ratio), then uses that value to estimate the attachment function with the full data.
only_A_estimate(net_object , net_stat = get_statistics(net_object), p = 0.75 , stop_cond = 10^-8 , mode_reg_A = 0 , MLE = FALSE , ...)
net_object |
an object of class |
net_stat |
An object of class |
p |
Numeric. This is the ratio of the number of new edges in the learning data to that of the full data. The data is then divided into two parts: learning data and testing data based on |
stop_cond |
Numeric. The iterative algorithm stops when |
mode_reg_A |
Binary. Indicates which regularization term is used for
|
MLE |
Logical. If |
... |
Other arguments to pass to the underlying algorithm. |
Outputs a Full_PAFit_result object, which is a list containing the following fields:
cv_data: a CV_Data object which contains the cross-validation data. Normally the user does not need to pay attention to this data. NULL if MLE = TRUE.
cv_result: a CV_Result object which contains the cross-validation result. Normally the user does not need to pay attention to this data. NULL if MLE = TRUE.
estimate_result: this is a PAFit_result object which contains the estimated PA function and its confidence interval. It also includes the estimated attachment exponent (assuming the model A_k = k^alpha) in the field alpha, and the confidence interval of alpha (in the field ci) when possible. In particular, the important fields are:
ratio: the selected value of the hyper-parameter that controls the regularization of the attachment function.
k and A: a degree vector and the estimated PA function.
var_A: the estimated variance of A.
var_logA: the estimated variance of log A.
upper_A: the upper value of the interval of two standard deviations around A.
lower_A: the lower value of the interval of two standard deviations around A.
center_k and theta: when we perform binning, these are the centers of the bins and the estimated PA values for those bins. theta is similar to A but with duplicated values removed.
var_bin: the variance of theta. Same as var_A but with duplicated values removed.
upper_bin: the upper value of the interval of two standard deviations around theta. Same as upper_A but with duplicated values removed.
lower_bin: the lower value of the interval of two standard deviations around theta. Same as lower_A but with duplicated values removed.
g: the number of bins used.
alpha and ci: alpha is the estimated attachment exponent (when assuming A_k = k^alpha), while ci is the confidence interval.
loglinear_fit: this is the fitting result when we estimate alpha.
objective_value: values of the objective function over iterations in the final run with the full data.
diverge_zero: logical value indicating whether the algorithm diverged in the final run with the full data.
Thong Pham [email protected]
1. Pham, T., Sheridan, P. & Shimodaira, H. (2015). PAFit: A Statistical Method for Measuring Preferential Attachment in Temporal Complex Networks. PLoS ONE 10(9): e0137796. (doi:10.1371/journal.pone.0137796).
2. Pham, T., Sheridan, P. & Shimodaira, H. (2016). Joint Estimation of Preferential Attachment and Node Fitness in Growing Complex Networks. Scientific Reports 6, Article number: 32558. (doi:10.1038/srep32558).
See get_statistics for how to create the summarized statistics needed in this function.
See Newman and Jeong for other methods to estimate the attachment function in isolation.
## Not run:
library("PAFit")
set.seed(1)
#### Example 1: Linear preferential attachment ####
# a network from BA model
net <- generate_net(N = 1000, m = 50, mode = 1, alpha = 1, s = 0)
net_stats <- get_statistics(net, only_PA = TRUE)
result <- only_A_estimate(net, net_stats)
# plot the estimated attachment function
plot(result, net_stats)
# true function
true_A <- result$estimate_result$center_k
lines(result$estimate_result$center_k, true_A, col = "red") # true line
legend("topleft", legend = "True function", col = "red", lty = 1, bty = "n")

#### Example 2: a non-log-linear preferential attachment ####
# A_k = alpha * log(max(k, 1))^beta + 1, with alpha = 2, and beta = 2
set.seed(1)
net <- generate_net(N = 1000, m = 50, mode = 3, alpha = 2, beta = 2, s = 0)
net_stats <- get_statistics(net, only_PA = TRUE)
result <- only_A_estimate(net, net_stats)
# plot the estimated attachment function
plot(result, net_stats)
# true function
true_A <- 2 * log(pmax(result$estimate_result$center_k, 1))^2 + 1
lines(result$estimate_result$center_k, true_A, col = "red") # true line
legend("topleft", legend = "True function", col = "red", lty = 1, bty = "n")

#### Example 3: another non-log-linear preferential attachment kernel ####
set.seed(1)
# A_k = min(max(k, 1), sat_at)^alpha, with alpha = 1, and sat_at = 200
# inverse variance of the distribution of node fitnesses = 10
net <- generate_net(N = 1000, m = 50, mode = 2, alpha = 1, sat_at = 200, s = 0)
net_stats <- get_statistics(net, only_PA = TRUE)
result <- only_A_estimate(net, net_stats)
# plot the estimated attachment function
true_A <- pmin(pmax(result$estimate_result$center_k, 1), 200)^1 # true function
plot(result, net_stats, max_A = max(true_A, result$estimate_result$theta))
lines(result$estimate_result$center_k, true_A, col = "red") # true line
legend("topleft", legend = "True function", col = "red", lty = 1, bty = "n")
## End(Not run)
This function estimates node fitnesses assuming either A_k = k (i.e. linear preferential attachment) or A_k = 1 (i.e. no preferential attachment). The method has one hyper-parameter, which belongs to the prior of the node fitnesses. It first performs a cross-validation to select the optimal value of this hyper-parameter (returned in the field shape), then estimates the node fitnesses with the full data (Ref. 1).
only_F_estimate(net_object , net_stat = get_statistics(net_object), p = 0.75 , stop_cond = 10^-8 , model_A = "Linear" , ...)
net_object |
an object of class |
net_stat |
An object of class |
p |
Numeric. This is the ratio of the number of new edges in the learning data to that of the full data. The data is then divided into two parts: learning data and testing data based on |
stop_cond |
Numeric. The iterative algorithm stops when |
model_A |
String. Indicates which attachment function
|
... |
Other arguments to pass to the underlying algorithm. |
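The two assumptions about the attachment function lead to different connection probabilities for the same fitnesses, which the following base-R sketch illustrates. The degrees and fitness values are hypothetical; the sketch only evaluates the standard form in which the probability that a node receives a new edge is proportional to its attachment value times its fitness, and does not call PAFit.

```r
# Hypothetical degrees and fitnesses of four existing nodes.
k   <- c(1, 3, 5, 1)
eta <- c(2.0, 1.0, 0.5, 1.5)

# Linear preferential attachment (A_k = k): probability proportional to k * fitness.
prob_linear <- k * eta / sum(k * eta)

# No preferential attachment (A_k = 1): probability proportional to fitness only.
prob_constant <- eta / sum(eta)

print(round(prob_linear, 3))
print(round(prob_constant, 3))
```

With these numbers the most attractive node changes from node 2 under A_k = k to node 1 under A_k = 1, so the fitness estimates recovered from the same data depend on which attachment function is assumed.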
Outputs a Full_PAFit_result object, which is a list containing the following fields:
cv_data: a CV_Data object which contains the cross-validation data. Normally the user does not need to pay attention to this data.
cv_result: a CV_Result object which contains the cross-validation result. Normally the user does not need to pay attention to this data.
estimate_result: this is a PAFit_result object which contains the estimated node fitnesses and their confidence intervals. In particular, the important fields are:
shape: the selected value of the hyper-parameter of the prior on node fitnesses.
g: the number of bins used.
f: the estimated node fitnesses.
var_f: the estimated variance of f.
upper_f: the estimated upper value of the interval of two standard deviations around f.
lower_f: the estimated lower value of the interval of two standard deviations around f.
objective_value: values of the objective function over iterations in the final run with the full data.
diverge_zero: logical value indicating whether the algorithm diverged in the final run with the full data.
Thong Pham [email protected]
1. Pham, T., Sheridan, P. & Shimodaira, H. (2016). Joint Estimation of Preferential Attachment and Node Fitness in Growing Complex Networks. Scientific Reports 6, Article number: 32558. (doi:10.1038/srep32558).
2. Bianconi, G. & Barabási, A. (2001). Competition and multiscaling in evolving networks. Europhys. Lett., 54, 436 (doi:10.1209/epl/i2001-00260-6).
3. Caldarelli, G., Capocci, A. , De Los Rios, P. & Muñoz, M.A. (2002). Scale-Free Networks from Varying Vertex Intrinsic Fitness. Phys. Rev. Lett., 89, 258702 (doi:10.1103/PhysRevLett.89.258702).
See get_statistics for how to create the summarized statistics needed in this function.
See joint_estimate for the method to jointly estimate the attachment function and node fitnesses.
## Not run:
library("PAFit")
set.seed(1)
# size of initial network = 100
# number of new nodes at each time-step = 100
# Ak = k; inverse variance of the distribution of node fitnesses = 10
net <- generate_BB(N = 1000, m = 50, num_seed = 100, multiple_node = 100, s = 10)
net_stats <- get_statistics(net)
# estimate node fitnesses in isolation, assuming Ak = k
result <- only_F_estimate(net, net_stats)
# plot the estimated node fitnesses and true node fitnesses
plot(result, net_stats, true = net$fitness, plot = "true_f")
## End(Not run)
This function estimates the attachment function from one snapshot.
PAFit_oneshot(net_object, M = 10, S = 5, loop = 5, G = 1000)
net_object |
an object of class |
M |
Integer. Number of simulated networks in each iteration. Default is |
S |
Integer. Number of iterations inside each loop. Default is |
loop |
Integer. Number of loops of the whole process. Default is |
G |
Integer. Number of bins for the PA function. Default is |
Outputs a PAFit_result object.
Thong Pham [email protected]
1. Pham, T., Sheridan, P. & Shimodaira, H. (2021). Non-parametric estimation of the preferential attachment function from one network snapshot. Journal of Complex Networks 9(5): cnab024. (doi:10.1093/comnet/cnab024).
## Not run:
library("PAFit")
net_1 <- generate_BA(N = 10000, alpha = 1) # true attachment exponent = 1.0
result_1 <- PAFit_oneshot(net_1)
print(result_1)
net_2 <- generate_BA(N = 10000, alpha = 0.5) # true attachment exponent = 0.5
result_2 <- PAFit_oneshot(net_2)
print(result_2)
## End(Not run)
This function extracts, from a Simulated_Data_From_Fitted_Model object, the contributions of the rich-get-richer and fit-get-richer effects calculated using simulated networks, and plots these contributions against the contributions calculated from the original observed network. See joint_estimate for a description of how the contributions are calculated.
plot_contribution(simulated_object, original_result, which_plot = "PA", y_label = ifelse("PA" == which_plot, "Contribution of the rich-get-richer effect", "Contribution of the fit-get-richer effect"), legend_pos_x = 0.75, legend_pos_y = 0.9)
simulated_object |
an object of class |
original_result |
an object of class |
which_plot |
String. "PA": plots contributions of the rich-get-richer effect; "fit": plots contributions of the fit-get-richer effect. Default is "PA". |
y_label |
String. The label for y-axis. Default is "Contribution of rich-get-richer effect". |
legend_pos_x |
Numeric. The horizontal position, between (0,1), of the legend. Default value is |
legend_pos_y |
Numeric. The vertical position, between (0,1), of the legend. Default value is |
Outputs a plot.
Thong Pham [email protected]
1. Pham, T., Sheridan, P. & Shimodaira, H. (2015). PAFit: A Statistical Method for Measuring Preferential Attachment in Temporal Complex Networks. PLoS ONE 10(9): e0137796. (doi:10.1371/journal.pone.0137796).
2. Pham, T., Sheridan, P. & Shimodaira, H. (2016). Joint Estimation of Preferential Attachment and Node Fitness in Growing Complex Networks. Scientific Reports 6, Article number: 32558. (doi:10.1038/srep32558).
3. Pham, T., Sheridan, P. & Shimodaira, H. (2020). PAFit: An R Package for the Non-Parametric Estimation of Preferential Attachment and Node Fitness in Temporal Complex Networks. Journal of Statistical Software 92 (3). (doi:10.18637/jss.v092.i03).
4. Inoue, M., Pham, T. & Shimodaira, H. (2020). Joint Estimation of Non-parametric Transitivity and Preferential Attachment Functions in Scientific Co-authorship Networks. Journal of Informetrics 14(3). (doi:10.1016/j.joi.2020.101042).
`joint_estimate`, `plot_contribution`
## Not run:
library("PAFit")
net_object <- generate_net(N = 500, m = 10, s = 10, alpha = 0.5)
net_stat <- get_statistics(net_object)
result <- joint_estimate(net_object, net_stat)
simulated_data <- generate_simulated_data_from_estimated_model(net_object, net_stat, result)
plot_contribution(simulated_data, result, which_plot = "PA")
plot_contribution(simulated_data, result, which_plot = "fit")
## End(Not run)
This function plots the estimated attachment function $A_k$ and node fitness $\eta_i$, together with additional information such as their confidence intervals or the estimated attachment exponent ($\alpha$ when assuming $A_k = k^\alpha$).
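To make the role of the attachment exponent concrete, here is a minimal sketch of estimating $\alpha$ by least squares on the log-log scale under the assumption $A_k = k^\alpha$. It is not the package's internal code: the vectors `k` and `A_hat` are made-up illustrative values rather than output of any PAFit function.

```r
# Hypothetical attachment values generated from A_k = k^0.8 with
# multiplicative noise; illustrative data, not PAFit output.
set.seed(1)
k <- 1:50
A_hat <- k^0.8 * exp(rnorm(length(k), sd = 0.05))

# Under A_k = c * k^alpha, log(A_k) = log(c) + alpha * log(k), so the
# slope of a regression of log(A_hat) on log(k) estimates alpha and the
# intercept absorbs the arbitrary scale c.
fit <- lm(log(A_hat) ~ log(k))
alpha_hat <- unname(coef(fit)[2])
alpha_hat
```

A slope close to 1 would indicate roughly linear preferential attachment, while a slope below 1 indicates a sub-linear attachment function.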
## S3 method for class 'Full_PAFit_result'
plot(x, net_stat,
     true_f = NULL, plot = "A", plot_bin = TRUE,
     line = FALSE, confidence = TRUE,
     high_deg_A = 1, high_deg_f = 5,
     shade_point = 0.5, col_point = "grey25", pch = 16,
     shade_interval = 0.5, col_interval = "lightsteelblue",
     label_x = NULL, label_y = NULL,
     max_A = NULL, min_A = NULL, f_min = NULL, f_max = NULL,
     plot_true_degree = FALSE, ...)
x | An object of class `Full_PAFit_result`. |
net_stat | An object of class `PAFit_data`, the return value of `get_statistics`. |
true_f | Vector. Optional parameter for the true value of node fitnesses (only available in simulated datasets). It is used together with `plot = "true_f"`, as in the example. |
plot | String. Indicates which plot is produced. Default value is `"A"`. |
plot_bin | Logical. Default value is `TRUE`. |
line | Logical. Indicates whether to plot the line fitted from the log-linear model or not. Default value is `FALSE`. |
confidence | Logical. Indicates whether to plot the confidence intervals. Default value is `TRUE`. |
high_deg_A | Integer. The estimated PA function is plotted starting from this degree. Default value is `1`. |
high_deg_f | Integer. Default value is `5`. |
col_point | String. The name of the color of the points. Default value is `"grey25"`. |
shade_point | Numeric. Value between 0 and 1. This is the transparency level of the points. Default value is `0.5`. |
pch | Numeric. The plot symbol. Default value is `16`. |
shade_interval | Numeric. Value between 0 and 1. This is the transparency level of the confidence intervals. Default value is `0.5`. |
max_A | Numeric. Specifies the maximum of the axis of PA. |
min_A | Numeric. Specifies the minimum of the axis of PA. |
f_min | Numeric. Specifies the minimum of the axis of fitness. |
f_max | Numeric. Specifies the maximum of the axis of fitness. |
plot_true_degree | Logical. Indicates whether the degree of each node is plotted or not. Default value is `FALSE`. |
label_x | String. The label of the x-axis. |
label_y | String. The label of the y-axis. |
col_interval | String. The name of the color of the confidence intervals. Default value is `"lightsteelblue"`. |
... | Other arguments to pass to the underlying plotting function. |
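The color and transparency arguments can be pictured with base graphics. The sketch below is not the package's plotting code; it only illustrates, under the assumption that the shading values act like alpha levels, how a point color, a point transparency, an interval color and an interval transparency could combine. The vectors `est`, `lower` and `upper` are made-up values.

```r
# Hypothetical estimates and interval bounds, for illustration only.
k <- 1:20
est <- k^0.9
lower <- est * 0.8
upper <- est * 1.2

# Draw to a temporary PDF so the sketch also runs without a display.
tmp <- tempfile(fileext = ".pdf")
pdf(tmp)
plot(k, est, log = "xy", type = "n",
     xlab = "Degree k", ylab = "Attachment function")
# Shaded band, mimicking col_interval = "lightsteelblue" and
# shade_interval = 0.5.
polygon(c(k, rev(k)), c(lower, rev(upper)),
        col = adjustcolor("lightsteelblue", alpha.f = 0.5), border = NA)
# Points, mimicking col_point = "grey25", shade_point = 0.5 and pch = 16.
points(k, est, pch = 16, col = adjustcolor("grey25", alpha.f = 0.5))
dev.off()
```

Lower alpha values make overlapping points and intervals easier to tell apart, which is presumably why both arguments default to a mid-range value.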
Outputs the desired plot.
Thong Pham [email protected]
## Since the runtime is long, we do not let this example run on CRAN
## Not run:
library("PAFit")
set.seed(1)
# a network from Bianconi-Barabasi model
net <- generate_BB(N = 1000, m = 50, num_seed = 100, multiple_node = 100, s = 10)
net_stats <- get_statistics(net)
result <- joint_estimate(net, net_stats)
# plot A
plot(result, net_stats, plot = "A")
true_A <- c(1, result$estimate_result$center_k[-1])
lines(result$estimate_result$center_k + 1, true_A, col = "red") # true line
legend("topleft", legend = "True function", col = "red", lty = 1, bty = "n")
# plot true_f
plot(result, net_stats, net$fitness, plot = "true_f")
## End(Not run)
This function plots the estimated attachment function from the corrected Newman's method or Jeong's method. It also plots additional information such as the estimated attachment exponent ($\alpha$ when assuming $A_k = k^\alpha$).
## S3 method for class 'PA_result'
plot(x, net_stat = NULL, plot_bin = TRUE, high_deg = 1, line = FALSE,
     col_point = "black", shade_point = 0.5, pch = 16,
     max_A = NULL, min_A = NULL, label_x = NULL, label_y = NULL, ...)
x | An object of class `PA_result`. |
net_stat | An object of class `PAFit_data`, the return value of `get_statistics`. Default value is `NULL`. |
plot_bin | Logical. Default value is `TRUE`. |
high_deg | Integer. Specifies the starting degree from which the attachment function is plotted. Default value is `1`. |
line | Logical. Indicates whether to plot the line fitted from the log-linear model or not. Default value is `FALSE`. |
col_point | String. The name of the color of the points. Default value is `"black"`. |
shade_point | Numeric. Value between 0 and 1. This is the transparency level of the points. Default value is `0.5`. |
pch | Numeric. The plot symbol. Default value is `16`. |
max_A | Numeric. Specifies the maximum of the horizontal axis. |
min_A | Numeric. Specifies the minimum of the horizontal axis. |
label_x | String. The label of the x-axis. Default value is `NULL`. |
label_y | String. The label of the y-axis. Default value is `NULL`. |
... | Other arguments to pass to the underlying plotting function. |
Outputs the desired plot.
Thong Pham [email protected]
library("PAFit")
net <- generate_net(N = 1000, m = 1, mode = 1, alpha = 1, s = 0)
net_stats <- get_statistics(net)
result <- Newman(net, net_stats)
# true function
true_A <- result$center_k
# plot the estimated attachment function
plot(result, net_stats)
lines(result$center_k, true_A, col = "red") # true attachment function
legend("topleft", legend = "True function", col = "red", lty = 1, bty = "n")
`PAFit_net` object
This function plots a `PAFit_net` object. The argument `plot` has four options that specify the type of plot.
The first two concern plotting the graph in `$graph` of the `PAFit_net` object. Option `plot = "graph"` plots the graph, while `plot = "degree"` plots the degree distribution. Option `slice` allows selection of the time-step at which the temporal graph is plotted.
The last two options concern plotting the PA function and node fitnesses (if they are not `NULL`).
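As an illustration of what `slice` selects, the sketch below assumes that `$graph` is a three-column matrix whose third column holds time stamps, which is what the default `slice = length(unique(x$graph[,3])) - 1` suggests. The toy matrix and the helper `edges_up_to()` are hypothetical and not part of PAFit, and treating slice 0 as the earliest time stamp is an assumption of this sketch.

```r
# Toy edge list with columns (from, to, time stamp). Hypothetical data.
graph <- matrix(c(1, 2, 0,
                  2, 3, 0,
                  3, 1, 1,
                  4, 1, 2,
                  4, 2, 2),
                ncol = 3, byrow = TRUE)

# Hypothetical helper: keep the edges that exist at a zero-based slice,
# i.e. all edges whose time stamp is not later than that slice's time.
edges_up_to <- function(graph, slice) {
  times <- sort(unique(graph[, 3]))
  graph[graph[, 3] <= times[slice + 1], , drop = FALSE]
}

# Same expression as the default value of 'slice': the last time-step.
default_slice <- length(unique(graph[, 3])) - 1

nrow(edges_up_to(graph, 0))              # edges at the first time-step
nrow(edges_up_to(graph, default_slice))  # edges at the last time-step
```

With the default value the whole temporal graph is shown; smaller values show the network as it looked at an earlier time-step.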
## S3 method for class 'PAFit_net'
plot(x, plot = "graph", slice = length(unique(x$graph[,3])) - 1, ...)
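x | An object of class `PAFit_net`. |
plot | String. Possible values are `"graph"`, `"degree"`, `"PA"` and `"fit"`. Default value is `"graph"`. |
slice | Integer. The time-step at which the temporal graph is plotted. Default value is `length(unique(x$graph[,3])) - 1`. |
... | Other arguments to pass to the underlying plotting function. |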
Outputs the desired plot.
Thong Pham [email protected]. When `plot = "graph"`, the function uses `plot.network.default` in the network package.
library("PAFit")
# a network from Bianconi-Barabasi model
net <- generate_BB(N = 50, m = 10, s = 10)
plot(net, plot = "graph")
plot(net, plot = "degree")
plot(net, plot = "PA")
plot(net, plot = "fit")
`PAFit_result` object
This function plots the estimated attachment function $A_k$ and node fitness $\eta_i$, together with additional information such as their confidence intervals or the estimated attachment exponent ($\alpha$ when assuming $A_k = k^\alpha$) of a `PAFit_result` object. This object is stored in the field `$estimate_result` of a `Full_PAFit_result` object, which in turn is the return value of `only_A_estimate`, `only_F_estimate` or `joint_estimate`.
## S3 method for class 'PAFit_result'
plot(x, net_stat = NULL,
     true_f = NULL, plot = "A", plot_bin = TRUE,
     line = FALSE, confidence = TRUE,
     high_deg_A = 1, high_deg_f = 5,
     shade_point = 0.5, col_point = "grey25", pch = 16,
     shade_interval = 0.5, col_interval = "lightsteelblue",
     label_x = NULL, label_y = NULL,
     max_A = NULL, min_A = NULL, f_min = NULL, f_max = NULL,
     plot_true_degree = FALSE, ...)
x | An object of class `PAFit_result`. |
net_stat | An object of class `PAFit_data`, the return value of `get_statistics`. Default value is `NULL`. |
true_f | Vector. Optional parameter for the true value of node fitnesses (only available in simulated datasets). It is used together with `plot = "true_f"`, as in the example. |
plot | String. Indicates which plot is produced. Default value is `"A"`. |
plot_bin | Logical. Default value is `TRUE`. |
line | Logical. Indicates whether to plot the line fitted from the log-linear model or not. Default value is `FALSE`. |
confidence | Logical. Indicates whether to plot the confidence intervals. Default value is `TRUE`. |
high_deg_A | Integer. The estimated PA function is plotted starting from this degree. Default value is `1`. |
high_deg_f | Integer. Default value is `5`. |
col_point | String. The name of the color of the points. Default value is `"grey25"`. |
shade_point | Numeric. Value between 0 and 1. This is the transparency level of the points. Default value is `0.5`. |
pch | Numeric. The plot symbol. Default value is `16`. |
shade_interval | Numeric. Value between 0 and 1. This is the transparency level of the confidence intervals. Default value is `0.5`. |
max_A | Numeric. Specifies the maximum of the axis of PA. |
min_A | Numeric. Specifies the minimum of the axis of PA. |
f_min | Numeric. Specifies the minimum of the axis of fitness. |
f_max | Numeric. Specifies the maximum of the axis of fitness. |
plot_true_degree | Logical. Indicates whether the degree of each node is plotted or not. Default value is `FALSE`. |
label_x | String. The label of the x-axis. |
label_y | String. The label of the y-axis. |
col_interval | String. The name of the color of the confidence intervals. Default value is `"lightsteelblue"`. |
... | Other arguments to pass to the underlying plotting function. |
Outputs the desired plot.
Thong Pham [email protected]
## Since the runtime is long, we do not let this example run on CRAN
## Not run:
library("PAFit")
set.seed(1)
# a network from Bianconi-Barabasi model
net <- generate_BB(N = 1000, m = 50, num_seed = 100, multiple_node = 100, s = 10)
net_stats <- get_statistics(net)
result <- joint_estimate(net, net_stats)
# plot A
plot(result$estimate_result, net_stats, plot = "A")
true_A <- c(1, result$estimate_result$center_k[-1])
lines(result$estimate_result$center_k + 1, true_A, col = "red") # true line
legend("topleft", legend = "True function", col = "red", lty = 1, bty = "n")
# plot true_f
plot(result, net_stats, net$fitness, plot = "true_f")
## End(Not run)
This function prints basic information about the cross-validation data stored in a `CV_Data` object. This object is the field `$cv_data` of a `Full_PAFit_result` object, which in turn is the return value of `only_A_estimate`, `only_F_estimate` or `joint_estimate`.
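The `print` and `summary` functions documented in this manual are S3 methods: calling `print()` or `summary()` on an object runs the method that matches its class. The sketch below shows the mechanism with a made-up class `Toy_Data`; it is only an illustration of S3 dispatch and not how PAFit implements its own methods.

```r
# A made-up S3 class used only to illustrate dispatch.
toy <- structure(list(n_edges = 5, n_nodes = 4), class = "Toy_Data")

# print() dispatches to print.<class> when such a method is defined.
print.Toy_Data <- function(x, ...) {
  cat("Toy data with", x$n_nodes, "nodes and", x$n_edges, "edges\n")
  invisible(x)
}

# summary() dispatches the same way; its first argument is named 'object'.
summary.Toy_Data <- function(object, ...) {
  cat("Edges per node:", object$n_edges / object$n_nodes, "\n")
  invisible(object)
}

print(toy)
summary(toy)
```

This is why `print(result$cv_data)` and `print(result)` can produce different output even though the same generic function is called.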
## S3 method for class 'CV_Data'
print(x, ...)
x | An object of class `CV_Data`. |
... | Other arguments to pass. |
Prints simple information of the cross-validation data.
Thong Pham [email protected]
## Since the runtime is long, we do not let this example run on CRAN
## Not run:
library("PAFit")
set.seed(1)
# a network from Bianconi-Barabasi model
net <- generate_BB(N = 1000, m = 50, num_seed = 100, multiple_node = 100, s = 10)
net_stats <- get_statistics(net)
result <- joint_estimate(net, net_stats)
print(result$cv_data)
## End(Not run)
This function prints basic information about the cross-validation result stored in a `CV_Result` object. This object is the field `$cv_result` of a `Full_PAFit_result` object, which in turn is the return value of `only_A_estimate`, `only_F_estimate` or `joint_estimate`.
## S3 method for class 'CV_Result'
print(x, ...)
x | An object of class `CV_Result`. |
... | Other arguments to pass. |
Prints simple information of the cross-validation result.
Thong Pham [email protected]
## Since the runtime is long, we do not let this example run on CRAN
## Not run:
library("PAFit")
set.seed(1)
# a network from Bianconi-Barabasi model
net <- generate_BB(N = 1000, m = 50, num_seed = 100, multiple_node = 100, s = 10)
net_stats <- get_statistics(net)
result <- joint_estimate(net, net_stats)
print(result$cv_result)
## End(Not run)
This function outputs simple information of the estimation result.
## S3 method for class 'Full_PAFit_result'
print(x, ...)
x | An object of class `Full_PAFit_result`. |
... | Other arguments to pass. |
Outputs summary information on the estimation result.
Thong Pham [email protected]
## Since the runtime is long, we do not let this example run on CRAN
## Not run:
library("PAFit")
set.seed(1)
# a network from Bianconi-Barabasi model
net <- generate_BB(N = 1000, m = 50, num_seed = 100, multiple_node = 100, s = 10)
net_stats <- get_statistics(net)
result <- joint_estimate(net, net_stats)
print(result)
## End(Not run)
This function outputs basic information about the estimated attachment function from the corrected Newman's method or Jeong's method.
## S3 method for class 'PA_result'
print(x, ...)
x | An object of class `PA_result`. |
... | Additional parameters to pass. |
Simple information of the estimated attachment function.
Thong Pham [email protected]
library("PAFit")
net <- generate_net(N = 1000, m = 1, mode = 1, alpha = 1, s = 0)
net_stats <- get_statistics(net)
result <- Newman(net, net_stats)
print(result)
`PAFit_data` object
This function prints basic information about the statistics stored in a `PAFit_data` object. This object is the return value of `get_statistics`.
## S3 method for class 'PAFit_data'
print(x, ...)
x | An object of class `PAFit_data`. |
... | Other arguments to pass. |
Prints simple information of the network statistics.
Thong Pham [email protected]
## Since the runtime is long, we do not let this example run on CRAN
## Not run:
library("PAFit")
set.seed(1)
# a network from Bianconi-Barabasi model
net <- generate_BB(N = 1000, m = 50, num_seed = 100, multiple_node = 100, s = 10)
net_stats <- get_statistics(net)
print(net_stats)
## End(Not run)
`PAFit_net` object
This function outputs basic information about a `PAFit_net` object.
## S3 method for class 'PAFit_net'
print(x, ...)
x | An object of class `PAFit_net`. |
... | Other arguments to pass. |
Outputs simple information of the network.
Thong Pham [email protected]
library("PAFit")
# a network from Bianconi-Barabasi model
net <- generate_BB(N = 50, m = 10, s = 10)
print(net)
`PAFit_result` object
This function outputs basic information about the estimation result stored in a `PAFit_result` object. This object is stored in the field `$estimate_result` of a `Full_PAFit_result` object, which in turn is the return value of `only_A_estimate`, `only_F_estimate` or `joint_estimate`.
## S3 method for class 'PAFit_result'
print(x, ...)
x | An object of class `PAFit_result`. |
... | Other arguments to pass. |
Outputs summary information on the estimation result.
Thong Pham [email protected]
## Since the runtime is long, we do not let this example run on CRAN
## Not run:
library("PAFit")
set.seed(1)
# a network from Bianconi-Barabasi model
net <- generate_BB(N = 1000, m = 50, num_seed = 100, multiple_node = 100, s = 10)
net_stats <- get_statistics(net)
result <- joint_estimate(net, net_stats)
print(result$estimate_result)
## End(Not run)
This function outputs summary information about the cross-validation data stored in a `CV_Data` object. This object is the field `$cv_data` of a `Full_PAFit_result` object, which in turn is the return value of `only_A_estimate`, `only_F_estimate` or `joint_estimate`.
## S3 method for class 'CV_Data'
summary(object, ...)
object | An object of class `CV_Data`. |
... | Other arguments to pass. |
Outputs summary information of the cross-validation data.
Thong Pham [email protected]
## Since the runtime is long, we do not let this example run on CRAN
## Not run:
library("PAFit")
set.seed(1)
# a network from Bianconi-Barabasi model
net <- generate_BB(N = 1000, m = 50, num_seed = 100, multiple_node = 100, s = 10)
net_stats <- get_statistics(net)
result <- joint_estimate(net, net_stats)
summary(result$cv_data)
## End(Not run)
This function outputs summary information about the cross-validation result stored in a `CV_Result` object. This object is the field `$cv_result` of a `Full_PAFit_result` object, which in turn is the return value of `only_A_estimate`, `only_F_estimate` or `joint_estimate`.
## S3 method for class 'CV_Result'
summary(object, ...)
object | An object of class `CV_Result`. |
... | Other arguments to pass. |
Outputs summary information of the cross-validation result.
Thong Pham [email protected]
## Since the runtime is long, we do not let this example run on CRAN
## Not run:
library("PAFit")
set.seed(1)
# a network from Bianconi-Barabasi model
net <- generate_BB(N = 1000, m = 50, num_seed = 100, multiple_node = 100, s = 10)
net_stats <- get_statistics(net)
result <- joint_estimate(net, net_stats)
summary(result$cv_result)
## End(Not run)
This function outputs a summary on the estimation result.
## S3 method for class 'Full_PAFit_result'
summary(object, ...)
object | An object of class `Full_PAFit_result`. |
... | Other arguments to pass. |
Outputs summary information on the estimation result.
Thong Pham [email protected]
## Since the runtime is long, we do not let this example run on CRAN
## Not run:
library("PAFit")
set.seed(1)
# a network from Bianconi-Barabasi model
net <- generate_BB(N = 1000, m = 50, num_seed = 100, multiple_node = 100, s = 10)
net_stats <- get_statistics(net)
result <- joint_estimate(net, net_stats)
summary(result)
## End(Not run)
This function outputs summary information about the estimated attachment function from the corrected Newman's method or Jeong's method.
## S3 method for class 'PA_result'
summary(object, ...)
object | An object of class `PA_result`. |
... | Additional parameters to pass. |
Summary information of the estimated attachment function.
Thong Pham [email protected]
library("PAFit")
net <- generate_net(N = 1000, m = 1, mode = 1, alpha = 1, s = 0)
net_stats <- get_statistics(net)
result <- Newman(net, net_stats)
summary(result)
`PAFit_data` object
This function outputs summary information about the statistics stored in a `PAFit_data` object. This object is the return value of `get_statistics`.
## S3 method for class 'PAFit_data'
summary(object, ...)
object | An object of class `PAFit_data`. |
... | Other arguments to pass. |
Outputs summary information of the network statistics.
Thong Pham [email protected]
## Since the runtime is long, we do not let this example run on CRAN
## Not run:
library("PAFit")
set.seed(1)
# a network from Bianconi-Barabasi model
net <- generate_BB(N = 1000, m = 50, num_seed = 100, multiple_node = 100, s = 10)
net_stats <- get_statistics(net)
summary(net_stats)
## End(Not run)
`PAFit_net` object
This function outputs summary information about a `PAFit_net` object.
## S3 method for class 'PAFit_net'
summary(object, ...)
object | An object of class `PAFit_net`. |
... | Other arguments to pass. |
Outputs summary information of the network.
Thong Pham [email protected]
library("PAFit")
# a network from Bianconi-Barabasi model
net <- generate_BB(N = 50, m = 10, s = 10)
summary(net)
`PAFit_result` object
This function outputs summary information about the estimation result stored in a `PAFit_result` object. This object is stored in the field `$estimate_result` of a `Full_PAFit_result` object, which in turn is the return value of `only_A_estimate`, `only_F_estimate` or `joint_estimate`.
## S3 method for class 'PAFit_result'
summary(object, ...)
object | An object of class `PAFit_result`. |
... | Other arguments to pass. |
Outputs summary information on the estimation result.
Thong Pham [email protected]
## Since the runtime is long, we do not let this example run on CRAN
## Not run:
library("PAFit")
set.seed(1)
# a network from Bianconi-Barabasi model
net <- generate_BB(N = 1000, m = 50, num_seed = 100, multiple_node = 100, s = 10)
net_stats <- get_statistics(net)
result <- joint_estimate(net, net_stats)
summary(result$estimate_result)
## End(Not run)
This function implements the method in Handcock and Jones (2004) to fit various distributions to a degree vector. The implemented distributions are Yule, Waring, Poisson, geometric and negative binomial. The Yule and Waring distributions correspond to a preferential attachment situation. In particular, the two distributions correspond to the case of $A_k = k$ for $k \ge 1$ and $\eta_i = 1$ for all $i$ (note that the numbers of new edges and new nodes at each time-step are implicitly assumed to be $1$).
Thus, if the best fitted distribution, which is chosen by either the Akaike Information Criterion (AIC) or the Bayesian Information Criterion (BIC), is NOT Yule or Waring, then the case of $A_k = k$ for $k \ge 1$ and $\eta_i = 1$ for all $i$ is NOT consistent with the observed degree vector.
The method allows the low-tail probabilities to NOT follow the parametric distribution, i.e., $P(K = k) = \pi_k$ for all $k \le k_{\min}$ and $P(K = k) = f(k, \theta)$ for all $k > k_{\min}$. Here $k_{\min}$ is the degree threshold above which the parametric distribution holds, $\pi_k$ are the probabilities of the low-tail, and $f(\cdot, \theta)$ is the parametric distribution with parameter vector $\theta$.
For fixed $k_{\min}$ and $f$, $\pi_k$ and $\theta$ can be estimated by Maximum Likelihood Estimation. We can choose the best $k_{\min}$ for each $f$ by comparing the AIC (or BIC). More details can be found in Handcock and Jones (2004).
test_linear_PA(degree_vector)
degree_vector | A degree vector. |
Outputs a `Linear_PA_test_result` object which contains the fitting of five distributions to the degree vector: Yule (`yule`), Waring (`waring`), Poisson (`pois`), geometric (`geom`) and negative binomial (`nb`). In particular, for each distribution, the AIC and BIC are calculated for each degree threshold $k_{\min}$.
Thong Pham [email protected]
1. Handcock MS, Jones JH (2004). “Likelihood-based inference for stochastic models of sexual network formation.” Theoretical Population Biology, 65(4), 413 – 422. ISSN 0040-5809. doi:10.1016/j.tpb.2003.09.006. Demography in the 21st Century, https://www.sciencedirect.com/science/article/pii/S0040580904000310.
## Not run:
library("PAFit")
set.seed(1)
net <- generate_BA(n = 1000)
stats <- get_statistics(net, only_PA = TRUE)
u <- test_linear_PA(stats$final_deg)
print(u)
## End(Not run)
This function converts a `PAFit_net` object to an `igraph` object (of package igraph).
to_igraph(net_object)
net_object | An object of class `PAFit_net`. |
The function returns an `igraph` object.
Thong Pham [email protected]
library("PAFit")
# a network from Bianconi-Barabasi model
net <- generate_BB(N = 50, m = 10, s = 10)
igraph_graph <- to_igraph(net)
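Once converted, the object can be passed to ordinary igraph functions. The sketch below assumes that both PAFit and igraph are installed and uses only `igraph::vcount()`, `igraph::ecount()` and `igraph::degree()`. How the conversion treats time stamps or repeated edges is not stated in this manual, so the values are simply printed rather than assumed.

```r
# Run only when both packages are available.
if (requireNamespace("PAFit", quietly = TRUE) &&
    requireNamespace("igraph", quietly = TRUE)) {
  library("PAFit")
  # a small network from the Bianconi-Barabasi model, as in the example
  net <- generate_BB(N = 50, m = 10, s = 10)
  g <- to_igraph(net)

  print(igraph::vcount(g))        # number of vertices
  print(igraph::ecount(g))        # number of edges
  print(head(igraph::degree(g)))  # degrees of the first few vertices
}
```

The guard with `requireNamespace()` is only there so that the sketch does nothing, instead of failing, on a machine where one of the packages is missing.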
This function converts a `PAFit_net` object to a `networkDynamic` object (of package networkDynamic).
to_networkDynamic(net_object)
net_object | An object of class `PAFit_net`. |
The function returns a `networkDynamic` object.
Thong Pham [email protected]
library("PAFit")
# a network from Bianconi-Barabasi model
net <- generate_BB(N = 50, m = 10, s = 10)
nD_graph <- to_networkDynamic(net)