How to Remove Columns in R

R : Select or Remove Columns from Data Frame

The article below explains how to select or remove columns (variables) from a data frame in R. R offers several ways to select or delete a column.

The following code creates a sample data frame that is used for demonstration.
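A minimal sketch of such a sample data frame. The name mydata and the columns a, x, y, and z are assumptions, chosen to match the variables mentioned in the examples that follow; the values themselves are arbitrary.

```r
# Hypothetical sample data: four columns named a, x, y, z (assumed names).
set.seed(1)  # make the random columns reproducible
mydata <- data.frame(
  a = letters[1:5],
  x = runif(5, min = 10, max = 50),
  y = sample(5),
  z = rnorm(5)
)
str(mydata)
```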

R : Remove column by name

In base R there are multiple ways to delete columns by name.

Method I : subset() function

The easiest way to remove columns is the subset() function. The code below tells R to drop the variables x and z; the '-' sign indicates dropping variables. Do not put the variable names in quotes when using subset().
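A short sketch of this method, assuming a data frame named mydata with columns a, x, y, and z (illustrative names and values):

```r
# Assumed sample data with columns a, x, y, z
mydata <- data.frame(a = letters[1:5], x = 1:5, y = 6:10, z = 11:15)

# Drop x and z; the names are unquoted and prefixed with a minus sign
df <- subset(mydata, select = -c(x, z))
names(df)  # "a" "y"
```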

Method II : ! sign

In this method, we create a character vector named drop that stores the column names x and z. We then tell R to select all variables except those named in drop. The names() function returns all the column names, and the '!' sign indicates negation.
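The same idea as a sketch, again assuming the illustrative data frame mydata with columns a, x, y, and z:

```r
# Assumed sample data with columns a, x, y, z
mydata <- data.frame(a = letters[1:5], x = 1:5, y = 6:10, z = 11:15)

# Column names to remove, stored as quoted strings
drop <- c("x", "z")

# Keep every column whose name is NOT in the drop vector
df <- mydata[, !(names(mydata) %in% drop)]
names(df)  # "a" "y"
```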

R : Remove columns by column index numbers

It's easy to remove columns by their position: all you need to do is give the column index numbers. The following code tells R to delete the first, third, and fourth columns. The minus sign drops variables.
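A sketch of dropping by position, assuming the illustrative four-column data frame mydata:

```r
# Assumed sample data with columns a, x, y, z (positions 1 to 4)
mydata <- data.frame(a = letters[1:5], x = 1:5, y = 6:10, z = 11:15)

# Drop the first, third, and fourth columns
df <- mydata[-c(1, 3:4)]
names(df)  # "x"
```

Indexing with a single bracket argument (no comma) keeps the result a data frame even when only one column remains.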

In this case, we tell R to keep only the variables in the second and fourth positions.
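Keeping by position looks the same without the minus sign. A sketch under the same assumed data:

```r
# Assumed sample data with columns a, x, y, z (positions 1 to 4)
mydata <- data.frame(a = letters[1:5], x = 1:5, y = 6:10, z = 11:15)

# Keep only the second and fourth columns
df <- mydata[c(2, 4)]
names(df)  # "x" "z"
```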

Select or Delete columns with dplyr package

Remove Columns by Name Pattern

The same logic applies if you want to find columns whose names contain a particular string. The example below keeps the columns whose names contain C_A and stores them in a new data frame.
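One way to sketch this uses grepl() on the column names. The data frame and its column names (INC_A, SAC_A, INC_B, ASD_A) are hypothetical, chosen only so that some names contain C_A and others do not.

```r
# Hypothetical data: two column names contain "C_A", two do not
mydata <- data.frame(INC_A = 1:3, SAC_A = 4:6, INC_B = 7:9, ASD_A = 10:12)

# TRUE for every column name that contains the literal text "C_A"
keep <- grepl("C_A", names(mydata), fixed = TRUE)

# New data frame with only the matching columns
newdata <- mydata[keep]
names(newdata)  # "INC_A" "SAC_A"
```

Negating the logical vector (mydata[!keep]) drops the matching columns instead.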

The following program automates selecting or deleting columns from a data frame.

To keep variables 'a' and 'x', use the code below. Setting drop = 0 keeps the variables listed in the parameter "cols". The parameter "data" is the input data frame, "cols" lists the variables you want to keep or remove, and "newdata" is the output data frame.

To drop variables, use the code below. Setting drop = 1 removes the variables named in the second parameter of the function.
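A minimal sketch of such a helper, not the article's own function. The name keep_drop is hypothetical, and this version returns the resulting data frame rather than taking a "newdata" parameter; the "data", "cols", and "drop" parameters follow the description above.

```r
# Hypothetical helper: keep (drop = 0) or remove (drop = 1) the named columns
keep_drop <- function(data, cols, drop = 1) {
  is_listed <- names(data) %in% cols
  if (drop == 1) {
    data[!is_listed]   # remove the listed columns
  } else {
    data[is_listed]    # keep only the listed columns
  }
}

# Assumed sample data with columns a, x, y, z
mydata <- data.frame(a = letters[1:5], x = 1:5, y = 6:10, z = 11:15)

kept    <- keep_drop(mydata, cols = c("a", "x"), drop = 0)
dropped <- keep_drop(mydata, cols = c("a", "x"), drop = 1)
names(kept)     # "a" "x"
names(dropped)  # "y" "z"
```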

How to Remove Columns from a Data Frame in R (With Examples)

The easiest way to remove columns from a data frame in R is to use the subset() function, which uses the following basic syntax:
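A sketch of that pattern, using a small hypothetical data frame with columns col1, col2, and col3:

```r
# Hypothetical data frame
df <- data.frame(col1 = 1:3, col2 = 4:6, col3 = 7:9)

# General pattern: subset(<data frame>, select = -c(<columns to remove>))
new_df <- subset(df, select = -c(col2))
names(new_df)  # "col1" "col3"
```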

The following examples show how to use this function in practice with the following data frame:
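An assumed example data frame with four numeric columns; the names var1 to var4 and the values are illustrative only.

```r
# Illustrative data frame with columns var1..var4 (assumed names and values)
df <- data.frame(
  var1 = c(1, 3, 2, 9, 5),
  var2 = c(7, 7, 8, 3, 2),
  var3 = c(3, 3, 6, 6, 8),
  var4 = c(1, 1, 2, 8, 7)
)
df
```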

Example 1: Remove Columns by Name

The following code shows how to remove columns from a data frame by name:
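A sketch, assuming a data frame df with columns var1 to var4 (illustrative values):

```r
# Assumed data frame with columns var1..var4
df <- data.frame(var1 = c(1, 3, 2), var2 = c(7, 7, 8),
                 var3 = c(3, 3, 6), var4 = c(1, 1, 2))

# Remove columns var1 and var3 by name
new_df <- subset(df, select = -c(var1, var3))
names(new_df)  # "var2" "var4"
```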

Example 2: Remove Columns by Index

The following code shows how to remove columns from a data frame by index:
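A sketch under the same assumed data frame; the select argument of subset() also accepts column positions:

```r
# Assumed data frame with columns var1..var4
df <- data.frame(var1 = c(1, 3, 2), var2 = c(7, 7, 8),
                 var3 = c(3, 3, 6), var4 = c(1, 1, 2))

# Remove the first and fourth columns by position
new_df <- subset(df, select = -c(1, 4))
names(new_df)  # "var2" "var3"
```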

Example 3: Remove Columns in a List

The following code shows how to remove columns from a data frame that belong to a specific list:
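A sketch in which the "list" is a character vector of column names. The vector name remove_cols is hypothetical.

```r
# Assumed data frame with columns var1..var4
df <- data.frame(var1 = c(1, 3, 2), var2 = c(7, 7, 8),
                 var3 = c(3, 3, 6), var4 = c(1, 1, 2))

# Names of the columns to remove (hypothetical vector name)
remove_cols <- c("var1", "var4")

# Keep only the columns whose names are not in remove_cols
new_df <- subset(df, select = !(names(df) %in% remove_cols))
names(new_df)  # "var2" "var3"
```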

Example 4: Remove Columns in a Range

The following code shows how to remove columns from a data frame within a specific range:
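A sketch under the same assumed data frame; inside select, var1:var3 expands to the positions of all columns from var1 through var3:

```r
# Assumed data frame with columns var1..var4
df <- data.frame(var1 = c(1, 3, 2), var2 = c(7, 7, 8),
                 var3 = c(3, 3, 6), var4 = c(1, 1, 2))

# Remove every column from var1 through var3
new_df <- subset(df, select = -c(var1:var3))
names(new_df)  # "var4"
```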

Remove an entire column from a data.frame in R

Does anyone know how to remove an entire column from a data.frame in R? For example if I am given this data.frame:

and I want to remove the 2nd column.

You can set it to NULL .
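For instance, assuming a data frame called Data whose second column is named genome (the data here is made up for illustration):

```r
# Made-up data frame; the second column is the one to remove
Data <- data.frame(chr    = c("chr1", "chr1", "chr2"),
                   genome = c("hg19", "hg19", "hg19"),
                   region = c("CDS", "exon", "CDS"))

# Assigning NULL to a column deletes it
Data$genome <- NULL
names(Data)  # "chr" "region"
```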

As pointed out in the comments, here are some other possibilities:
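A sketch of a few equivalent base R forms, on the same made-up Data (each variant works on its own copy):

```r
# Made-up data frame with three columns
Data <- data.frame(chr    = c("chr1", "chr1", "chr2"),
                   genome = c("hg19", "hg19", "hg19"),
                   region = c("CDS", "exon", "CDS"))

d1 <- Data; d1[2]   <- NULL   # list-style assignment by position
d2 <- Data; d2[[2]] <- NULL   # same, with double brackets
d3 <- Data[, -2]              # matrix-style negative index
d4 <- Data[-2]                # list-style negative index

names(d1)  # "chr" "region"
```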

You can remove multiple columns via:
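One form that does this, sketched on the same made-up Data, assigns list(NULL) to several positions at once:

```r
# Made-up data frame with three columns
Data <- data.frame(chr    = c("chr1", "chr1", "chr2"),
                   genome = c("hg19", "hg19", "hg19"),
                   region = c("CDS", "exon", "CDS"))

# Remove the first two columns in one assignment
Data[1:2] <- list(NULL)
names(Data)  # "region"
```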

Be careful with matrix-subsetting though, as you can end up with a vector:
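A sketch of the pitfall on the same made-up Data: when matrix-style subsetting leaves a single column, the result is simplified to a vector unless drop = FALSE is given.

```r
# Made-up data frame with three columns
Data <- data.frame(chr    = c("chr1", "chr1", "chr2"),
                   genome = c("hg19", "hg19", "hg19"),
                   region = c("CDS", "exon", "CDS"))

as_vector <- Data[, -(2:3)]                # one column left: becomes a vector
as_frame  <- Data[, -(2:3), drop = FALSE]  # stays a data.frame

is.data.frame(as_vector)  # FALSE
is.data.frame(as_frame)   # TRUE
```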

To remove one or more columns by name, when the column names are known (as opposed to being determined at run-time), I like the subset() syntax. E.g. for the data-frame

to remove just the a column you could do

and to remove the b and d columns you could do

You can remove all columns between d and b with:
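The three cases can be sketched as follows, assuming a data frame whose columns appear in the order a, d, c, b (the values are illustrative):

```r
# Assumed data frame; note the column order a, d, c, b
df <- data.frame(a = 1:3, d = 2:4, c = 3:5, b = 4:6)

no_a     <- subset(df, select = -a)        # remove just a
no_bd    <- subset(df, select = -c(d, b))  # remove b and d
no_range <- subset(df, select = -c(d:b))   # remove d through b (d, c, b)

names(no_a)      # "d" "c" "b"
names(no_bd)     # "a" "c"
names(no_range)  # "a"
```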

As I said above, this syntax works only when the column names are known. It won’t work when say the column names are determined programmatically (i.e. assigned to a variable). I’ll reproduce this Warning from the ?subset documentation:

Warning:

This is a convenience function intended for use interactively. For programming it is better to use the standard subsetting functions like '[', and in particular the non-standard evaluation of argument 'subset' can have unanticipated consequences.
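For the programmatic case, one possible sketch with '[' keeps the column names in a variable. The name cols_to_drop is hypothetical, and the data frame is the same assumed a, d, c, b example.

```r
# Assumed data frame with columns a, d, c, b
df <- data.frame(a = 1:3, d = 2:4, c = 3:5, b = 4:6)

# Column names determined at run time (hypothetical variable)
cols_to_drop <- c("b", "d")

# Standard subsetting: keep the names that are not in cols_to_drop
result <- df[setdiff(names(df), cols_to_drop)]
names(result)  # "a" "c"
```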

How to Remove Column in R?

To remove one or more columns from an R data frame, use square bracket notation [] or functions from third-party packages such as dplyr. There are several ways to remove columns (variables) from a data frame (data.frame).

1. Prepare the Data

Let's create an R data frame, run these examples, and explore the output. If your data is already in a CSV or Excel file, you can import it into an R data frame instead.
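A sketch of such a data frame. The column names id, pages, name, chapters, and price are inferred from the examples in the following sections; the values are made up.

```r
# Illustrative data frame; column order matters for the index examples
df <- data.frame(
  id       = c(11, 22, 33, 44, 55),
  pages    = c(32, 45, 33, 22, 56),
  name     = c("spark", "python", "R", "java", "jsp"),
  chapters = c(76, 86, 11, 15, 7),
  price    = c(144, 553, 321, 567, 890)
)
df
```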

2. Remove Column using R Base Functions

By using R base function subset() or square bracket notation you can remove single or multiple columns by index/name from the R DataFrame.

2.1 Remove Column by Index

First, let's use base R bracket notation df[] to remove a column by index. The syntax df[, columns] selects columns; to remove columns instead, prefix the indexes with the - (minus) operator.

The following example removes the second column by Index from the R DataFrame.
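A sketch, assuming a data frame df with the columns id, pages, name, chapters, and price:

```r
# Assumed data frame (made-up values)
df <- data.frame(id = c(11, 22, 33), pages = c(32, 45, 33),
                 name = c("spark", "python", "R"),
                 chapters = c(76, 86, 11), price = c(144, 553, 321))

# Remove the second column (pages) by index
df2 <- df[, -2]
names(df2)  # "id" "name" "chapters" "price"
```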

2.2 Remove Columns by Range

Bracket notation also accepts a range of columns, and the minus operator removes that range. The following example removes all columns with indexes 2 through 4, that is, the columns pages, name, and chapters.
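A sketch under the same assumed df. The range is wrapped in c() because -2:4 would mean the sequence from -2 to 4 rather than "minus columns 2 to 4".

```r
# Assumed data frame (made-up values)
df <- data.frame(id = c(11, 22, 33), pages = c(32, 45, 33),
                 name = c("spark", "python", "R"),
                 chapters = c(76, 86, 11), price = c(144, 553, 321))

# Remove columns 2 through 4 (pages, name, chapters)
df2 <- df[, -c(2:4)]
names(df2)  # "id" "price"
```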

2.3 Remove Multiple Columns

Use a vector to specify the indexes of the columns you want to remove from the data frame. The following example removes the columns with indexes 2 and 3.
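A sketch under the same assumed df:

```r
# Assumed data frame (made-up values)
df <- data.frame(id = c(11, 22, 33), pages = c(32, 45, 33),
                 name = c("spark", "python", "R"),
                 chapters = c(76, 86, 11), price = c(144, 553, 321))

# Remove columns 2 and 3 (pages, name)
df2 <- df[, -c(2, 3)]
names(df2)  # "id" "chapters" "price"
```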

2.4 Remove Columns From List

You can also remove columns whose names appear in a character vector. Here the names() function returns all column names, and the %in% operator checks whether each name is present in that vector.
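A sketch under the same assumed df. The choice of id, name, and chapters as the columns to remove is illustrative.

```r
# Assumed data frame (made-up values)
df <- data.frame(id = c(11, 22, 33), pages = c(32, 45, 33),
                 name = c("spark", "python", "R"),
                 chapters = c(76, 86, 11), price = c(144, 553, 321))

# Keep only the columns whose names are not in the given vector
df2 <- df[, !(names(df) %in% c("id", "name", "chapters"))]
names(df2)  # "pages" "price"
```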

2.5 By using subset() Function

The base R function subset() also removes columns by name. It takes the data frame as its first argument and the columns to remove in its select argument, prefixed with a minus sign.
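A sketch under the same assumed df, removing the illustrative columns id, name, and chapters:

```r
# Assumed data frame (made-up values)
df <- data.frame(id = c(11, 22, 33), pages = c(32, 45, 33),
                 name = c("spark", "python", "R"),
                 chapters = c(76, 86, 11), price = c(144, 553, 321))

# Remove id, name, and chapters by (unquoted) name
df2 <- subset(df, select = -c(id, name, chapters))
names(df2)  # "pages" "price"
```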

Yields the same output as above.

3. Remove Columns by using dplyr Functions

In this section, I use functions from the dplyr package to remove columns from an R data frame. dplyr is an R package that provides a grammar of data manipulation: a consistent set of verbs for the most common data manipulation tasks. To use it, first install it with install.packages('dplyr') and then load it with library(dplyr).

3.1 Remove Column by Matching

The dplyr select() function selects columns; negating the selection removes them. All verbs in the dplyr package take a data.frame as their first argument. dplyr code usually uses the infix operator %>% from magrittr, which passes the left-hand side of the operator as the first argument of the function on the right-hand side.

For example, x %>% f(y) is converted into f(x, y), so the result of the left-hand side is "piped" into the right-hand side. The pipe lets you write a chain of operations that reads left to right.
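A sketch of select() with a negated set of names, assuming dplyr is installed and df has the columns id, pages, name, chapters, and price (made-up values):

```r
library(dplyr)

# Assumed data frame (made-up values)
df <- data.frame(id = c(11, 22, 33), pages = c(32, 45, 33),
                 name = c("spark", "python", "R"),
                 chapters = c(76, 86, 11), price = c(144, 553, 321))

# Pipe df into select() and drop the named columns
df2 <- df %>% select(-c(id, name, chapters))
names(df2)  # "pages" "price"
```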

3.2 Remove Variables By Name Range

The same function can also be used to remove variables by name range.
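A sketch under the same assumptions; name:chapters covers every column from name through chapters in the data frame's column order:

```r
library(dplyr)

# Assumed data frame (made-up values)
df <- data.frame(id = c(11, 22, 33), pages = c(32, 45, 33),
                 name = c("spark", "python", "R"),
                 chapters = c(76, 86, 11), price = c(144, 553, 321))

# Drop the columns from name through chapters
df2 <- df %>% select(-(name:chapters))
names(df2)  # "id" "pages" "price"
```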

3.3 Remove Variables using contains

Use -contains() to drop columns whose names contain a given text. The following example removes the column chapters because its name contains the text apt. contains() also accepts a character vector of several values to match.
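A sketch under the same assumptions:

```r
library(dplyr)

# Assumed data frame (made-up values)
df <- data.frame(id = c(11, 22, 33), pages = c(32, 45, 33),
                 name = c("spark", "python", "R"),
                 chapters = c(76, 86, 11), price = c(144, 553, 321))

# Drop every column whose name contains "apt" (here: chapters)
df2 <- df %>% select(-contains("apt"))
names(df2)  # "id" "pages" "name" "price"
```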

3.4 Remove Column starts with

Use -starts_with() to drop columns whose names start with a given text. The following example removes the column chapters because its name starts with the character c.
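A sketch under the same assumptions:

```r
library(dplyr)

# Assumed data frame (made-up values)
df <- data.frame(id = c(11, 22, 33), pages = c(32, 45, 33),
                 name = c("spark", "python", "R"),
                 chapters = c(76, 86, 11), price = c(144, 553, 321))

# Drop every column whose name starts with "c" (here: chapters)
df2 <- df %>% select(-starts_with("c"))
names(df2)  # "id" "pages" "name" "price"
```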

3.5 Remove Column ends with

Similarly, use -ends_with() to remove variables whose names end with a given text. The following example removes the name and price columns because their names end with the letter e.
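A sketch under the same assumptions:

```r
library(dplyr)

# Assumed data frame (made-up values)
df <- data.frame(id = c(11, 22, 33), pages = c(32, 45, 33),
                 name = c("spark", "python", "R"),
                 chapters = c(76, 86, 11), price = c(144, 553, 321))

# Drop every column whose name ends with "e" (here: name and price)
df2 <- df %>% select(-ends_with("e"))
names(df2)  # "id" "pages" "chapters"
```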

3.6 Remove Columns if it exists

Finally, use the one_of() function to remove columns only if they exist in the data frame. If a listed column is not found, one_of() issues a warning instead of an error.
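A sketch under the same assumptions. The column name unknown_col is hypothetical and deliberately absent from df. In current dplyr releases one_of() is superseded by any_of(), which ignores missing names silently; both are shown.

```r
library(dplyr)

# Assumed data frame (made-up values)
df <- data.frame(id = c(11, 22, 33), pages = c(32, 45, 33),
                 name = c("spark", "python", "R"),
                 chapters = c(76, 86, 11), price = c(144, 553, 321))

# one_of(): drops name, warns that unknown_col does not exist
df2 <- df %>% select(-one_of("name", "unknown_col"))

# any_of(): same result, without the warning
df3 <- df %>% select(-any_of(c("name", "unknown_col")))

names(df2)  # "id" "pages" "chapters" "price"
```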
