Package 'NCA' reference manual

Title:	Necessary Condition Analysis
Description:	Performs a Necessary Condition Analysis (NCA). (Dul, J. 2016. Necessary Condition Analysis (NCA). ''Logic and Methodology of 'Necessary but not Sufficient' causality." Organizational Research Methods 19(1), 10-52) <doi:10.1177/1094428115584005>. NCA identifies necessary (but not sufficient) conditions in datasets, where x causes (e.g. precedes) y. Instead of drawing a regression line ''through the middle of the data'' in an xy-plot, NCA draws the ceiling line. The ceiling line y = f(x) separates the area with observations from the area without observations. (Nearly) all observations are below the ceiling line: y <= f(x). The empty zone is in the upper left hand corner of the xy-plot (with the convention that the x-axis is ''horizontal'' and the y-axis is ''vertical'' and that values increase ''upwards'' and ''to the right''). The ceiling line is a (piecewise) linear non-decreasing line: a linear step function or a straight line. It indicates which level of x (e.g. an effort or input) is necessary but not sufficient for a (desired) level of y (e.g. good performance or output). A quick start guide for using this package can be found here: <https://repub.eur.nl/pub/78323/> or <https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2624981>.
Authors:	Jan Dul [aut], Govert Buijs [cre]
Maintainer:	Govert Buijs <[email protected]>
License:	GPL (>= 3)
Version:	4.0.2
Built:	2025-02-08 03:52:26 UTC
Source:	https://github.com/cran/NCA

Necessary Condition Analysis

Description

The NCA package implements Necessary Condition Analysis (NCA) as developed by Dul (2016). For running the NCA package a data file (e.g., mydata.csv, which contains the input data) must be available. An example data file (presented in above article) is included in the package. The user must load the data and call the nca function.

Details

Package:	NCA
Type:	Package
Version:	4.0.2
Date:	2024-11-08
License:	GPL (>= 3)

Author(s)

Author: Jan Dul [email protected]
Maintainer: Govert Buijs [email protected]

References

Dul, J. 2016. Necessary Condition Analysis (NCA).Logic and methodology of 'necessary but not sufficient' causality. Organizational Research Methods 19(1), 10-52. doi:10.1177/1094428115584005
Dul, J. (2020). Conducting Necessary Condition Analysis. Sage publishers. ISBN: 9781526460141. https://uk.sagepub.com/en-gb/eur/conducting-necessary-condition-analysis-for-business-and-management-students/book262898
Dul, J., van der Laan, E., & Kuik, R. (2020). A statistical significance test for Necessary Condition Analysis. Organizational Research Methods, 23(2), 385-395. doi:10.1177/1094428118795272

Examples

# A more detailed guide can be found here : https://repub.eur.nl/pub/78323/
# or https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2624981

# Load data from a CSV file with header and row names:
data <- read.csv('mydata.csv', row.names=1)
# Or load and rename the example dataset
data(nca.example)
data <- nca.example

# Run NCA with the dataset and name the analysis 'model'.
# Specify the independent (cause) and dependent (effect) variables by column index or name
# More than 1 independent variables can be specified with a vector
model <- nca_analysis(data, c(1, 2), 3)

# A quick summary of the analysis can be displayed by 'model'
model

# A full summary of the analysis is shown by nca_output (see documentation for more options)
nca_output(model)

# The results of the analysis is a list of 6 items :
# - plots (1 for each independent variable)
# - summaries (1 for each independent variable)
# - bottleneck tables (1 for each ceiling technique)
# - peers (1 dataframe for each independent variable)
# - tests (1 list for each independent variabl)
# - test.time (total time to run all tests)
names(model)

# The first item contains the graphical outputs for each independent variable
# This is not really useful to humans
model$plots[[1]]

# The seconds item contains a list with the summaries for the independent variables
model$summaries[[1]]

# The third item contains a list with the bottleneck tables, one for each ceiling technique
model$bottlenecks$cr_fdh

# The fourth item shows the peers, for each independent variable
model$peers$Individualism

# For the fifth and sixth item, the test.rep needs to be larger than 0
# for performing the statistical test
# Optionally the p_confidence (default 0.95) and the p_threshold (default 0) can be set
model <- nca_analysis(data, c(1, 2), 3, test.rep=100)

# The fifth item shows the tests for each independent variable
# This is not really useful to humans
model$tests$Individualism

# The last item shows the total time needed to perform the analysis.
# For large values of test.rep the test may take long.  
model$test.time
# A more detailed guide can be found here : https://repub.eur.nl/pub/78323/
# or https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2624981

# Load data from a CSV file with header and row names:
data <- read.csv('mydata.csv', row.names=1)
# Or load and rename the example dataset
data(nca.example)
data <- nca.example

# Run NCA with the dataset and name the analysis 'model'.
# Specify the independent (cause) and dependent (effect) variables by column index or name
# More than 1 independent variables can be specified with a vector
model <- nca_analysis(data, c(1, 2), 3)

# A quick summary of the analysis can be displayed by 'model'
model

# A full summary of the analysis is shown by nca_output (see documentation for more options)
nca_output(model)

# The results of the analysis is a list of 6 items :
# - plots (1 for each independent variable)
# - summaries (1 for each independent variable)
# - bottleneck tables (1 for each ceiling technique)
# - peers (1 dataframe for each independent variable)
# - tests (1 list for each independent variabl)
# - test.time (total time to run all tests)
names(model)

# The first item contains the graphical outputs for each independent variable
# This is not really useful to humans
model$plots[[1]]

# The seconds item contains a list with the summaries for the independent variables
model$summaries[[1]]

# The third item contains a list with the bottleneck tables, one for each ceiling technique
model$bottlenecks$cr_fdh

# The fourth item shows the peers, for each independent variable
model$peers$Individualism

# For the fifth and sixth item, the test.rep needs to be larger than 0
# for performing the statistical test
# Optionally the p_confidence (default 0.95) and the p_threshold (default 0) can be set
model <- nca_analysis(data, c(1, 2), 3, test.rep=100)

# The fifth item shows the tests for each independent variable
# This is not really useful to humans
model$tests$Individualism

# The last item shows the total time needed to perform the analysis.
# For large values of test.rep the test may take long.  
model$test.time

a set of all available ceiling techniques

Description

Ceilings to use for the nca or nca_analysis methods
> nca(data, c(1, 2), 3, ceilings=c('ols', 'ce_fdh', 'cr_fdh'))

Note that the ols regression line is not a ceiling but is included as a reference.

Ceiling Technique	Name
cols	Corrected Ordinary Least Squares
qr	Quantile Regression
ce_vrs	Ceiling Envelopment with Varying Return to Scale
cr_vrs	Ceiling Regression with Varying Return to Scale
ce_fdh	Ceiling Envelopment with Free Disposal Hull
cr_fdh	Ceiling Regression with Free Disposal Hull
c_lp	Ceiling Linear Programming

Note: The SFA and LH ceiling lines are deprecated (discontinued) from version 3.2.0

a set defining the line colors for the plots

Description

Set before calling nca_output
> line.colors['ce_fdh'] <- 'blue'

Reset one line color by setting it to NULL
> line.colors['ce_fdh'] <- NULL

Reset all line colors by setting line.colors to NULL
> line.colors <- NULL

Format

This is a list with default line colors for each ceiling technique

ols	'green'	c_lp	'blue'
cols	'darkgreen'	qr	'lightpink'
ce_vrs	'orchid4'	cr_vrs	'violet'
ce_fdh	'red'	cr_fdh	'orange'

a set defining the line types for the plots

Description

Set before calling nca_output
> line.types['ce_fdh'] <- 1

Reset one line type by setting it to NULL
> line.types['ce_fdh'] <- NULL

Reset all line types by setting line.types to NULL
> line.types <- NULL

Format

This is a list with default line types for each ceiling technique

ols	1	c_lp	2
cols	3	qr	4
ce_vrs	5	cr_vrs	1
ce_fdh	6	cr_fdh	1

parameter defining the width of the lines in the plots

Description

This will be used for the lwd parameter of the plot, default is 1.5.
Set before calling nca_output
> line.width <- 5

Run a basic NCA analyses on a data set

Description

Run a basic NCA analyses on a data set

Usage

nca(data, x, y, ceilings=c('ols', 'ce_fdh', 'cr_fdh'))nca(data, x, y, ceilings=c('ols', 'ce_fdh', 'cr_fdh'))

Arguments

`data`	dataframe with columns of the variables
`x`	collection of the columns with the independent variables
`y`	index or name of the column with the dependent variable
`ceilings`	vector with the ceiling techniques to include in this analysis

Value

Returns a list with 3 items (see examples for further explanation):

`plots`	A list of plot-data for each x-y combination
`summaries`	A list of dataframes with the summaries for each x-y combination
`bottlenecks`	A list of dataframes with a bottleneck table for each ceiling technique

Examples

# Load the data
data(nca.example)
data <- nca.example

# Basic NCA analysis
# Independent variables in the first 2 columns, dependent variable in the third column
# This shows scatter plot(s) with the ceiling lines and the effect size(s) on the console
nca(data, c(1, 2), 3)

# Columns can be selected by name as well
nca(data, c('Individualism', 'Risk taking'), 'Innovation performance')

# Define the ceiling techniques via the ceilings parameter
nca(data, c(1, 2), 3, ceilings=c('ols', 'ce_vrs'))
# These are the available ceiling techniques
print(ceilings)
# Load the data
data(nca.example)
data <- nca.example

# Basic NCA analysis
# Independent variables in the first 2 columns, dependent variable in the third column
# This shows scatter plot(s) with the ceiling lines and the effect size(s) on the console
nca(data, c(1, 2), 3)

# Columns can be selected by name as well
nca(data, c('Individualism', 'Risk taking'), 'Innovation performance')

# Define the ceiling techniques via the ceilings parameter
nca(data, c(1, 2), 3, ceilings=c('ols', 'ce_vrs'))
# These are the available ceiling techniques
print(ceilings)

Run NCA analyses on a data set

Description

Run multiple types of NCA analyses on a dataset

Usage

nca_analysis(data, x, y, ceilings=c('ols', 'ce_fdh', 'cr_fdh'),
             corner=NULL, flip.x=FALSE, flip.y=FALSE, scope=NULL,
             bottleneck.x='percentage.range', bottleneck.y='percentage.range',
             steps=10, step.size=NULL, cutoff=0, qr.tau=0.95,
             effect_aggregation = 1, test.rep=0,
             test.p_confidence=0.95, test.p_threshold=0.05)nca_analysis(data, x, y, ceilings=c('ols', 'ce_fdh', 'cr_fdh'),
             corner=NULL, flip.x=FALSE, flip.y=FALSE, scope=NULL,
             bottleneck.x='percentage.range', bottleneck.y='percentage.range',
             steps=10, step.size=NULL, cutoff=0, qr.tau=0.95,
             effect_aggregation = 1, test.rep=0,
             test.p_confidence=0.95, test.p_threshold=0.05)

Arguments

`data`	dataframe with columns of the variables
`x`	index or name (or a vector of those) with independent variable(s) x
`y`	index or name of the column with the dependent variable y
`ceilings`	vector with the ceiling techniques to include in this analysis
`corner`	either an integer or a vector of integers, indicating the corner to analyze, see Details
`flip.x`	reverse the direction of the independent variables Use either a boolean for all independent variables, or a vector with the same length as x
`flip.y`	reverse the direction of the dependent variables, boolean
`scope`	a theoretical scope in list format : (x.low, x.high, y.low, y.high), see Details
`bottleneck.x`	options for displaying the independent variables in the bottleneck table 'percentage.range' to display the percentage of range between min(x) and max(x) 'percentage.max' to display the percentage of max(x) 'actual' to display the actual values 'percentile' to display the percentiles Using percentage.max with negative values might yield counterintuitive results.
`bottleneck.y`	options for displaying the dependent variables in the bottleneck table. See bottleneck.x
`steps`	this argument accepts 2 types : - an integer (number of steps in the bottleneck table) - a list of values (used as Y values in the bottleneck table). Only useful if step.size is not defined.
`step.size`	define the step size in the bottleneck table. The user will be warned if the stepsize does not fit the Y range. Defaults to null for using steps.
`cutoff`	display calculated x,y values that are lower/higher than lowest/highest observed x,y values in the bottleneck table as: 0 : NN (not necessary) and NA (not available) 1 : NN (not necessary) and highest observed values 2 : calculated values
`qr.tau`	define the qr tau (between 0 and 1) for the quantile regression ceiling technique, default 0.95
`effect_aggregation`	define the corners to aggregate into the effect size. 1 is upper-left and is always selected, 2 is upper-right, 3 is lower-left and 4 is lower-right
`test.rep`	number of resamples in the statistical approximate permutation test. For test.rep = 0 no statistical test is performed
`test.p_confidence`	confidence level of the estimated p-value. Is used to calculate the p-accuracy for a given number of resamples (test.rep), default 0.95.
`test.p_threshold`	define the threshold significance level in the returned plot of the statistical test, default 0.05

Details

Corners
Corner 1 is the upper-left corner and corner 2 is the upper-right corner. These two corners are used for an analysis of the necessity of the presence/high level if x (corner = 1 ) or the absence/low level if x (corner = 2) for the presence/high level of y, respectively.
Corner 3 is the lower-left corner and corner 4 is the lower-right corner. These two corners are used for an analysis of the necessity of the presence/high level of x (corner = 3 ) or the absence/low level if x (corner = 4) for the absence/low level of y, respectively.
By default the upper left corner is analysed for all independent variables and corner is not defined. If corner is defined, flip.x and flip.y are ignored.

Scope
By default, the theoretical scope is not defined and the empirical scope is used based on the minimum and maximum observed values of x and y.

Value

Returns a list of 6 items (see examples for further explanation):

`plots`	A list of plot-data for each x-y combination
`summaries`	A list of dataframes with the summaries for each x-y combination
`bottlenecks`	A list of dataframes with a bottleneck table for each ceiling technique
`peers`	A list of ceilings, with a list of peers for each independent variable. Peers are corner points of the CE-FDH ceiling line (e.g., the northwest-corners points for corner = 1)
`tests`	The results of the test for each independent variable (not human friendly, use nca_output)
`test.time`	The total time needed to run the tests for all independent variables

Examples

# Load the data
data(nca.example)
data <- nca.example

# Basic NCA analysis, with independent variables in the first 2 columns
# and the dependent variable in the third column
model <- nca_analysis(data, c(1, 2), 3)

# Use nca_output to show the summaries (see nca_output documentation for more options)
nca_output(model)

# Columns can be selected by name as well
model <- nca_analysis(data, c('Individualism', 'Risk taking'), 'Innovation performance')

# Define the ceiling techniques via the ceilings parameter, see 'ceilings' for all types
model <- nca_analysis(data, c(1, 2), 3, ceilings=c('ce_fdh', 'ce_vrs'))

# These are the available ceiling techniques
print(ceilings)

# By default the upper-left corner is analysed. With the corner argument for each
# independent variable a different corner can be selected. Select corner 1 or 2
# for an analysis of necessary conditions for the presence/high level of the
# dependent variable, and corner 3 or 4 for an analysis of necessary conditions for
# the absence/low level of the dependent variable. It is not possible to combine
# corner 1 or 2 with corner 3 or 4 in the same analysis as different outcomes are analysed.
# This analyses the upper right corner for the first independent variable
# and the upper left corner for the second independent variable:
model <- nca_analysis(data, c(1, 2), 3, corner=c(2, 1))

# Alternatively, for using the upper right corner(s), 'flip' the x variables
model <- nca_analysis(data, c(1, 2), 3, flip.x=TRUE)

# It is also possible to flip a single x variable
model <- nca_analysis(data, c(1, 2), 3, flip.x=c(TRUE, FALSE))

# Flip the y variable if the lower corners need analysing
model <- nca_analysis(data, c(1, 2), 3, flip.x=c(TRUE, FALSE), flip.y=TRUE)

# Use a theoretical scope instead of the (calculated) empirical scope
model <- nca_analysis(data, c(1, 2), 3, scope=c(0, 120, 0, 240))

# Display the peers for a ceiling and an independent variable
print(model$peers$ce_fdh$Individualism)

# By default, the bottleneck tables use percentages of the range for the x and y values.
# Using the percentage of the max value is also possible
model <- nca_analysis(data, c(1, 2), 3, bottleneck.y='percentage.max')

# Use the actual values, in this case the x-value
model <- nca_analysis(data, c(1, 2), 3, bottleneck.x='actual')

# Use percentile, in this case for the y-values
model <- nca_analysis(data, c(1, 2), 3, bottleneck.y='percentile')

# Any combination is possible
model <- nca_analysis(data, c(1, 2), 3, bottleneck.x='actual', bottleneck.y='percentile')

# The number of steps is adjustible via the steps parameter
model <- nca_analysis(data, c(1, 2), 3, steps=20)

# The steps parameter also accepts a list of values
# These are interpreted as actual or percentage / percentile depending on bottleneck.y
model <- nca_analysis(data, c(1, 2), 3, steps=seq(50, 120, 10))

# Or via the step.size parameter, this ignores the steps parameter
model <- nca_analysis(data, c(1, 2), 3, step.size=5)

# If the ceiling line crosses the X = Xmax line at a point C below Y = Ymax,
# for Y < Yc < Ymax, the corresponding X in the bottleneck table is displayed as 'NA'
# It is also possible to display them as Xmax
model <- nca_analysis(data, c(1, 2), 3, cutoff=1)

# or as the calculated value on the ceiling line
model <- nca_analysis(data, c(1, 2), 3, cutoff=2)

# To run tests, the test.rep needs to be larger than 0
# Optionally the p_confidence (default 0.95) and the p_threshold (default 0) can be set
model <- nca_analysis(data, c(1), 3, test.rep=1000, test.p_confidence=0.9, test.p_threshold=0.05)

# The output of the tests can be shown via nca_output with test=TRUE
nca_output(model, test=TRUE)
# Load the data
data(nca.example)
data <- nca.example

# Basic NCA analysis, with independent variables in the first 2 columns
# and the dependent variable in the third column
model <- nca_analysis(data, c(1, 2), 3)

# Use nca_output to show the summaries (see nca_output documentation for more options)
nca_output(model)

# Columns can be selected by name as well
model <- nca_analysis(data, c('Individualism', 'Risk taking'), 'Innovation performance')

# Define the ceiling techniques via the ceilings parameter, see 'ceilings' for all types
model <- nca_analysis(data, c(1, 2), 3, ceilings=c('ce_fdh', 'ce_vrs'))

# These are the available ceiling techniques
print(ceilings)

# By default the upper-left corner is analysed. With the corner argument for each
# independent variable a different corner can be selected. Select corner 1 or 2
# for an analysis of necessary conditions for the presence/high level of the
# dependent variable, and corner 3 or 4 for an analysis of necessary conditions for
# the absence/low level of the dependent variable. It is not possible to combine
# corner 1 or 2 with corner 3 or 4 in the same analysis as different outcomes are analysed.
# This analyses the upper right corner for the first independent variable
# and the upper left corner for the second independent variable:
model <- nca_analysis(data, c(1, 2), 3, corner=c(2, 1))

# Alternatively, for using the upper right corner(s), 'flip' the x variables
model <- nca_analysis(data, c(1, 2), 3, flip.x=TRUE)

# It is also possible to flip a single x variable
model <- nca_analysis(data, c(1, 2), 3, flip.x=c(TRUE, FALSE))

# Flip the y variable if the lower corners need analysing
model <- nca_analysis(data, c(1, 2), 3, flip.x=c(TRUE, FALSE), flip.y=TRUE)

# Use a theoretical scope instead of the (calculated) empirical scope
model <- nca_analysis(data, c(1, 2), 3, scope=c(0, 120, 0, 240))

# Display the peers for a ceiling and an independent variable
print(model$peers$ce_fdh$Individualism)

# By default, the bottleneck tables use percentages of the range for the x and y values.
# Using the percentage of the max value is also possible
model <- nca_analysis(data, c(1, 2), 3, bottleneck.y='percentage.max')

# Use the actual values, in this case the x-value
model <- nca_analysis(data, c(1, 2), 3, bottleneck.x='actual')

# Use percentile, in this case for the y-values
model <- nca_analysis(data, c(1, 2), 3, bottleneck.y='percentile')

# Any combination is possible
model <- nca_analysis(data, c(1, 2), 3, bottleneck.x='actual', bottleneck.y='percentile')

# The number of steps is adjustible via the steps parameter
model <- nca_analysis(data, c(1, 2), 3, steps=20)

# The steps parameter also accepts a list of values
# These are interpreted as actual or percentage / percentile depending on bottleneck.y
model <- nca_analysis(data, c(1, 2), 3, steps=seq(50, 120, 10))

# Or via the step.size parameter, this ignores the steps parameter
model <- nca_analysis(data, c(1, 2), 3, step.size=5)

# If the ceiling line crosses the X = Xmax line at a point C below Y = Ymax,
# for Y < Yc < Ymax, the corresponding X in the bottleneck table is displayed as 'NA'
# It is also possible to display them as Xmax
model <- nca_analysis(data, c(1, 2), 3, cutoff=1)

# or as the calculated value on the ceiling line
model <- nca_analysis(data, c(1, 2), 3, cutoff=2)

# To run tests, the test.rep needs to be larger than 0
# Optionally the p_confidence (default 0.95) and the p_threshold (default 0) can be set
model <- nca_analysis(data, c(1), 3, test.rep=1000, test.p_confidence=0.9, test.p_threshold=0.05)

# The output of the tests can be shown via nca_output with test=TRUE
nca_output(model, test=TRUE)

Outlier detection

Description

Detect outliers on the dataset.

Usage

nca_outliers(data, x, y, ceiling = NULL,
corner = NULL, flip.x = FALSE, flip.y = FALSE, scope=NULL,
k = 1, min.dif = 1e-2, max.results = 25, plotly=FALSE, condensed = FALSE)nca_outliers(data, x, y, ceiling = NULL,
corner = NULL, flip.x = FALSE, flip.y = FALSE, scope=NULL,
k = 1, min.dif = 1e-2, max.results = 25, plotly=FALSE, condensed = FALSE)

Arguments

`data`	Dataframe with columns of the variables
`x`	Index or name of the column with the independent variable
`y`	Index or name of the column with the dependent variable
`ceiling`	Name of the ceiling technique to be used. If not provided, the default ceilings (CE_FDH) will be used
`corner`	either an integer or a vector of integers, indicating the corner to analyze, see Details
`flip.x`	reverse the direction of the independent variables Use either a boolean for all independent variables, or a vector with the same length as x
`flip.y`	reverse the direction of the dependent variables, boolean
`scope`	a theoretical scope in list format : (x.low, x.high, y.low, y.high), see nca_analysis
`k`	use combinations of observations, default is 1 (single observations)
`min.dif`	set the threshold for the minimum dif.rel to be considered as outlier, default is 1e-2
`max.results`	only show the first 'max.results' outliers, default is 25
`plotly`	If true shows the interactive scatter plot(s), one for each independent variable. In RStudio the plots are shown in the Viewer window. See Details.
`condensed`	If true and k > 1, hide outlier combinations for which the effect size is not increased compared to the effect size of single elements of that combination. (default FALSE)

Details

Outliers
The potential outliers are displayed with the original effects size and the effect size if this outlier is removed. The absolute and relative differences between both effect sizes is also shown. The table also displays if a point is a ceiling zone outlier or a scope outlier.

Plotly
The plot highlights the potential outliers. The names, relative difference and XY coordinates of all points pop up when moving the pointer over the plot. The toolbar allows several actions such as zoom and selection of parts of the plot.

Examples

# A basic example of the nca_outliers command:
data(nca.example)
outliers <- nca_outliers(nca.example, 1, 3)

# This prints the outlier table
print(outliers)

# Plotly displays a scatterplot with the outliers
nca_outliers(nca.example, 1, 3, plotly = TRUE)

# Test for combinations of observations
# Useful to detect clusters of observations as possible outliers
nca_outliers(nca.example, 1, 3, k = 2)

# Just like the nca_analysis command, nca_outliers accept both flip and corner arguments
nca_outliers(nca.example, 1, 3, corner=3)

# It is possible to define the maximum number of results (default is 25)
nca_outliers(nca.example, 1, 3, max.results=5)

# Do no show possible outliers where the abs(dif.rel) is smaller than min.dif
nca_outliers(nca.example, 1, 3, min.dif=10)

# If k > 1, the effect size of a single observation might not change
# when paired with another observation, e.g. dif.rel of Obs1 == dif.rel of Obs1+Obs2.
# The example below hides combinations of Japan with Portugal, Greece, etc.
nca_outliers(nca.example, 1, 3, k = 2, condensed = TRUE)
# A basic example of the nca_outliers command:
data(nca.example)
outliers <- nca_outliers(nca.example, 1, 3)

# This prints the outlier table
print(outliers)

# Plotly displays a scatterplot with the outliers
nca_outliers(nca.example, 1, 3, plotly = TRUE)

# Test for combinations of observations
# Useful to detect clusters of observations as possible outliers
nca_outliers(nca.example, 1, 3, k = 2)

# Just like the nca_analysis command, nca_outliers accept both flip and corner arguments
nca_outliers(nca.example, 1, 3, corner=3)

# It is possible to define the maximum number of results (default is 25)
nca_outliers(nca.example, 1, 3, max.results=5)

# Do no show possible outliers where the abs(dif.rel) is smaller than min.dif
nca_outliers(nca.example, 1, 3, min.dif=10)

# If k > 1, the effect size of a single observation might not change
# when paired with another observation, e.g. dif.rel of Obs1 == dif.rel of Obs1+Obs2.
# The example below hides combinations of Japan with Portugal, Greece, etc.
nca_outliers(nca.example, 1, 3, k = 2, condensed = TRUE)

display the result of the NCA analysis

Description

Show the plots, NCA summaries and bottleneck tables of a NCA analysis.

Usage

nca_output(model, plots=TRUE, plotly=FALSE, bottlenecks=FALSE,
           summaries=TRUE, test=FALSE, pdf=FALSE, path=NULL, selection = NULL)nca_output(model, plots=TRUE, plotly=FALSE, bottlenecks=FALSE,
           summaries=TRUE, test=FALSE, pdf=FALSE, path=NULL, selection = NULL)

Arguments

`model`	Displays output of the nca or nca_analysis command
`plots`	If true (default) show the scatter plot(s), one for each independent variable. In Rstudio the plots are shown the Plots window.
`plotly`	If true shows the interactive scatter plot(s), one for each independent variable. In RStudio the plots are shown in the Viewer window. See Details.
`bottlenecks`	If true displays the bottleneck table(s) in the Console window, one table for each ceiling line
`summaries`	If true shows the summaries for each independent variable in the Console window, see Details
`test`	If true shows the result of the statistical sigificance test (if present), see Details
`pdf`	If true exports the output to a pdf file, except for the plotly plot
`path`	Optional path for the output file(s)
`selection`	Optionally selects the independent variables for inclusion in the output. For example, only the plots for the selected independent variable are shown. In the bottleneck table the required levels of only the selected independent variables for given levels of Y are shown.

Details

Plotly
The plot highlights the points that construct the ceiling line ('peers'). The names and XY coordinates of all points pop up when moving the pointer over the plot. The toolbar allows several actions such as zoom and selection of parts of the plot. Optionally subgroups of points can be labeled.

Summaries
The output starts with 6 lines of basic information ("global") about the dataset ("Number of observations", "Scope", "Xmin", "Xmax", "Ymin", and "Ymax"). "Scope" refers to the empirical area of possible x-y combinations, given the minimum and maximum observed x and y values.
The next 11 lines present the NCA parameters ("param", see below) for each of the selected ceiling techniques (the defaults techniques are ce_fdh and cr_fdh).

The 11 printed NCA parameters are:
- Ceiling zone, which is the size of the "empty" area in the upper-left corner
- Effect size, which is the ceiling zone divided by the scope
- # above, which is the number of observations that are above the ceiling line and hence in the "empty" ceiling zone
- c-accuracy, which is the number of observations on or below the ceiling line divided by the total number of observations and multiplied by 100 percent
- Fit, which relates to the "closeness" of the selected ceiling line to the ce_fdh ceiling line
- Slope and Intercept, which are the slope and the intercept of the straight ceiling line (no values are printed if the ceiling line is not a straight line, but a step function)
- Abs. ineff., which is the total xy-space where x does not constrain y, and y is not constrained by x
- Rel. ineff., which is the total xy-space where x does not constrain y, and y is not constrained by x as percentage of the scope
- Condition ineff., which is the condition inefficiency that indicates for which range of x (as a percentage of the total range) x does not constrain y (i.e., there is no ceiling line in that x-range)
- Outcome ineff., which is the outcome efficiency that indicates for which range of y (as a percentage of the total range of y) y is not constrained by x (i.e., there is no ceiling line in that y-range)

Test
NCA's statistical test is a randomness test to evaluate if the observed effect size may be a random result of unrelated X and Y variables

Examples

# Use the result of the nca command:
data(nca.example)
data <- nca.example
model <- nca_analysis(data, c(1, 2), 3)

# Show the full summaries in the Console window
nca_output(model)

# Suppress the summaries and display the plots
nca_output(model, plots=TRUE, summaries=FALSE)

# Display the plots via Plotly
nca_output(model, plotly=TRUE, summaries=FALSE)

# Label the observation of the Plotly plot by using a vector of names (no more than 5).
# For example label the observations in nca.example
labels <- c('Australia', 'Europe', 'Europe', 'North America', 'Europe', 'Europe',
'Europe', 'Europe', 'Europe', 'Europe', 'Europe', 'Europe', 'Europe', 'Asia',
'North America', 'Europe', 'Australia', 'Europe', 'Europe', 'Europe', 'Europe',
'Asia', 'Europe', 'Europe', 'Europe', 'Europe', 'Europe', 'North America')
nca_output(model, plotly=labels, summaries=FALSE)

# Suppress the summaries and display the bottlenecks
nca_output(model, bottlenecks=TRUE, summaries=FALSE)

# Show the results of the statistical significance test (p-value)
# Make sure to set test.rep in nca_analysis
nca_output(model, test=TRUE)

# Show all five
nca_output(model, plots=TRUE, plotly=TRUE, bottlenecks=TRUE, test=TRUE)

# Per independent variable, export plots and summaries to PDF files,
# and export all the bottleneck tables to a single PDF file
nca_output(model, plots=TRUE, bottlenecks=TRUE, pdf=TRUE)

# Use the path option to export to an existing directory
outdir <- '/tmp' 
nca_output(model, plots=TRUE, pdf=TRUE, path=outdir)

# Limit the output to a selection of independent variables by name
nca_output(model, plots=TRUE, selection=c("Individualism"))

# Or by column index, in both cases the order matters
nca_output(model, plots=TRUE, selection=c(2, 1))
# Use the result of the nca command:
data(nca.example)
data <- nca.example
model <- nca_analysis(data, c(1, 2), 3)

# Show the full summaries in the Console window
nca_output(model)

# Suppress the summaries and display the plots
nca_output(model, plots=TRUE, summaries=FALSE)

# Display the plots via Plotly
nca_output(model, plotly=TRUE, summaries=FALSE)

# Label the observation of the Plotly plot by using a vector of names (no more than 5).
# For example label the observations in nca.example
labels <- c('Australia', 'Europe', 'Europe', 'North America', 'Europe', 'Europe',
'Europe', 'Europe', 'Europe', 'Europe', 'Europe', 'Europe', 'Europe', 'Asia',
'North America', 'Europe', 'Australia', 'Europe', 'Europe', 'Europe', 'Europe',
'Asia', 'Europe', 'Europe', 'Europe', 'Europe', 'Europe', 'North America')
nca_output(model, plotly=labels, summaries=FALSE)

# Suppress the summaries and display the bottlenecks
nca_output(model, bottlenecks=TRUE, summaries=FALSE)

# Show the results of the statistical significance test (p-value)
# Make sure to set test.rep in nca_analysis
nca_output(model, test=TRUE)

# Show all five
nca_output(model, plots=TRUE, plotly=TRUE, bottlenecks=TRUE, test=TRUE)

# Per independent variable, export plots and summaries to PDF files,
# and export all the bottleneck tables to a single PDF file
nca_output(model, plots=TRUE, bottlenecks=TRUE, pdf=TRUE)

# Use the path option to export to an existing directory
outdir <- '/tmp' 
nca_output(model, plots=TRUE, pdf=TRUE, path=outdir)

# Limit the output to a selection of independent variables by name
nca_output(model, plots=TRUE, selection=c("Individualism"))

# Or by column index, in both cases the order matters
nca_output(model, plots=TRUE, selection=c(2, 1))

Function to evaluate power

Description

Function to evaluate power, test if a sample size is large enough to detect necessity.

Usage

nca_power(n = c(20, 50, 100), effect = 0.10, slope = 1, ceiling = "ce_fdh",
p = 0.05, distribution.x = "uniform", distribution.y = "uniform", rep = 100,
test.rep = 200)nca_power(n = c(20, 50, 100), effect = 0.10, slope = 1, ceiling = "ce_fdh",
p = 0.05, distribution.x = "uniform", distribution.y = "uniform", rep = 100,
test.rep = 200)

Arguments

`n`	Number of datapoints to generate, either an integer or a vector of integers.
`effect`	Effect size of the generated datasets.
`slope`	Slope of the line.
`ceiling`	Ceiling technique to use for this analysis
`p`	Targeted confidence level
`distribution.x`	Distribution type(s) for X, "uniform" (default) or "normal".
`distribution.y`	Distribution type(s) for Y, "uniform" (default) or "normal".
`rep`	Number of analyses done per iteration.
`test.rep`	Number of resamples in the statistical approximate permutation test. For test.rep = 0 no statistical test is performed

Examples

# Simple example
## Not run: results <- nca_power()

print(results)
# Simple example
## Not run: results <- nca_power()

print(results)

generating random data that meets necessity

Description

Generate N datapoints, with 'normal' or 'uniform' distributions for X and Y

Usage

nca_random(n, intercepts, slopes, corner=1,
                  distribution.x = "uniform", distribution.y = "uniform",
                  mean.x = 0.5, mean.y = 0.5, sd.x = 0.2, sd.y = 0.2)nca_random(n, intercepts, slopes, corner=1,
                  distribution.x = "uniform", distribution.y = "uniform",
                  mean.x = 0.5, mean.y = 0.5, sd.x = 0.2, sd.y = 0.2)

Arguments

`n`	Number of observations to generate, should be an integer > 1.
`intercepts`	The intercept or a vector of intercepts of the line.
`slopes`	The slope or a vector if slopes of the line.
`corner`	Define which corner should be empty, default is 1 (upper left).
`distribution.x`	Type of the distribution for X, "uniform" (default) or "normal". The latter is a truncated normal distribution.
`distribution.y`	Type of the distribution for Y, "uniform" (default) or "normal". The latter is a truncated normal distribution.
`mean.x`	Distribution Mean of X (default 0.5), ignored distribution.x == "uniform".
`mean.y`	Distribution Mean of Y (default 0.5), ignored distribution.y == "uniform".
`sd.x`	Distribution SD of X (default 0.2), ignored distribution.x == "uniform".
`sd.y`	Distribution SD of Y (default 0.2), ignored distribution.y == "uniform".

Examples

# Generate a uniform dataset, default for X and Y
data <- nca_random(100, 0, 1)

# It is also possible to generate a dataset with multiple independent variables,
# by supplying vectors for the intercepts and slopes
data <- nca_random(100, c(0, 0.25), c(1, 0.75))
# Single values will be repeated to complement a vector  
data <- nca_random(100, c(0, 0.25), 1)

# The default is an empty space in the upper left corner.
# A different corner can be selected with the corner argument
data <- nca_random(100, 0, 1, corner=4)

# Generate a dataset with a normal distribution for X and a uniform distribution for Y
data <- nca_random(100, 0, 1, distribution.x = "normal", distribution.y = "uniform")

# Generate a dataset with a normal distribution for X and Y, with adjusted MEAN
data <- nca_random(100, 0, 1, distribution.x = "normal", distribution.y = "normal",
                   mean.x = 0.75, mean.y = 0.75)

# Generate a dataset with a normal distribution for X and Y, with adjusted SD
data <- nca_random(100, 0, 1, distribution.x = "normal",
                   distribution.y = "normal", sd.x = 0.1, sd.y = 0.1)
# Generate a uniform dataset, default for X and Y
data <- nca_random(100, 0, 1)

# It is also possible to generate a dataset with multiple independent variables,
# by supplying vectors for the intercepts and slopes
data <- nca_random(100, c(0, 0.25), c(1, 0.75))
# Single values will be repeated to complement a vector  
data <- nca_random(100, c(0, 0.25), 1)

# The default is an empty space in the upper left corner.
# A different corner can be selected with the corner argument
data <- nca_random(100, 0, 1, corner=4)

# Generate a dataset with a normal distribution for X and a uniform distribution for Y
data <- nca_random(100, 0, 1, distribution.x = "normal", distribution.y = "uniform")

# Generate a dataset with a normal distribution for X and Y, with adjusted MEAN
data <- nca_random(100, 0, 1, distribution.x = "normal", distribution.y = "normal",
                   mean.x = 0.75, mean.y = 0.75)

# Generate a dataset with a normal distribution for X and Y, with adjusted SD
data <- nca_random(100, 0, 1, distribution.x = "normal",
                   distribution.y = "normal", sd.x = 0.1, sd.y = 0.1)

NCA example data with 2 independent and 1 dependent variables

Description

This data set has Individualism and Risk taking as independent variables, and Innovation performance as the dependent variable for 28 countries.

Usage

data(nca.example)data(nca.example)

Format

A matrix containing 28 observations, incl. headers and row names.

NCA example data with 3 independent and 1 dependent variables

Description

This data set has Contractual detail, Goodwill trust, and Competence trust as independent variables, and Innovation as dependent variable for 48 buyer-supplier relationships. See Table 2 in: Van der Valk, W., Sumo, R., Dul, J., & Schroeder, R. G. (2016). When are contracts and trust necessary for innovation in buyer-supplier relationships? A necessary condition analysis. Journal of Purchasing and Supply Management, 22(4), 266-277.

Usage

data(nca.example2)data(nca.example2)

Format

A matrix containing 48 observations, incl. headers.

parameter defining the point color in the plots

Description

This will be used for the col parameters of the plots, default is blue.
Set before calling nca_output
> point.color <- red

parameter defining the plotting symbol in the plots

Description

This will be used for the pch parameter of the plots, default is 21.
Set before calling nca_output
> point.type <- 22

See http://www.sthda.com/english/wiki/r-plot-pch-symbols-the-different-point-shapes-available-in-r for more symbols

Package 'NCA'

Help Index

Necessary Condition Analysis

Description

Details

Author(s)

References

See Also

Examples

a set of all available ceiling techniques

Description

a set defining the line colors for the plots

Description

Format

a set defining the line types for the plots

Description

Format

parameter defining the width of the lines in the plots

Description

Run a basic NCA analyses on a data set

Description

Usage

Arguments

Value

Examples

Run NCA analyses on a data set

Description

Usage

Arguments

Details

Value

Examples

Outlier detection

Description

Usage

Arguments

Details

Examples

display the result of the NCA analysis

Description

Usage

Arguments

Details

Examples

Function to evaluate power

Description

Usage

Arguments

Examples

generating random data that meets necessity

Description

Usage

Arguments

Examples

NCA example data with 2 independent and 1 dependent variables

Description

Usage

Format

NCA example data with 3 independent and 1 dependent variables

Description

Usage

Format

parameter defining the point color in the plots

Description

parameter defining the plotting symbol in the plots

Description