# 7. Indexing Into Vectors

Given a vector of data, one common task is to isolate particular entries or to censor items that meet some criteria. Here we show how to use R’s indexing notation to pick out specific items within a vector.

## 7.1. Indexing With Logicals

We first give an example of how to select specific items in a
vector. The first step is to define a vector of data, and the second
step is to define a vector of logical values, one for each data item.
When the logical vector is used as the index into the data vector,
only the items at positions where the logical vector is *TRUE*
are returned:

```
> a <- c(1,2,3,4,5)
> b <- c(TRUE,FALSE,FALSE,TRUE,FALSE)
> a[b]
[1] 1 4
> max(a[b])
[1] 4
> sum(a[b])
[1] 5
```
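As a small sketch of what the logical index is doing, the built-in `which` function reports the positions that are *TRUE*, and `sum` counts them because *TRUE* is treated as 1. The names `picked`, `positions`, and `n_picked` are only illustrative and do not appear elsewhere in this section:

```r
a <- c(1, 2, 3, 4, 5)
b <- c(TRUE, FALSE, FALSE, TRUE, FALSE)

picked    <- a[b]      # items of a where b is TRUE: 1 4
positions <- which(b)  # positions where b is TRUE: 1 4
n_picked  <- sum(b)    # number of TRUE entries in b: 2
```

Here the positions happen to equal the selected values only because `a` holds 1 through 5.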

## 7.2. Not Available or Missing Values

One common problem is data entries that are marked *NA*, or not
available. There is a predefined constant called *NA* that can be used
to indicate missing information. The problem is that many functions
return *NA*, and some throw an error, if any of the entries in the
data is *NA*. Some functions allow you to ignore the missing values
through special options such as *na.rm*:

```
> a <- c(1,2,3,4,NA)
> a
[1]  1  2  3  4 NA
> sum(a)
[1] NA
> sum(a,na.rm=TRUE)
[1] 10
```
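The `na.rm` option is not specific to `sum`. As a brief sketch, `mean` and `max` accept the same argument; the variable names below are chosen only for illustration:

```r
a <- c(1, 2, 3, 4, NA)

m_all   <- mean(a)                # NA, because one entry is missing
m_ok    <- mean(a, na.rm = TRUE)  # 2.5, the mean of 1, 2, 3, 4
biggest <- max(a, na.rm = TRUE)   # 4
```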

There are other times, though, when this option is not available, or
you simply want to censor the missing values. The *is.na* function
returns a logical vector marking which items are not available. The
logical “not” operator in R is the *!* symbol. Combining the two in
the indexing notation removes the *NA* items from a vector:

```
> a <- c(1,2,3,4,NA)
> is.na(a)
[1] FALSE FALSE FALSE FALSE  TRUE
> !is.na(a)
[1]  TRUE  TRUE  TRUE  TRUE FALSE
> a[!is.na(a)]
[1] 1 2 3 4
> b <- a[!is.na(a)]
> b
[1] 1 2 3 4
```
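The same *is.na* index can also be used to count the missing entries, or on the left side of an assignment to overwrite them instead of dropping them. This is a minimal sketch; replacing with 0 is an arbitrary choice for illustration, and the names `n_missing` and `filled` are hypothetical:

```r
a <- c(1, 2, 3, 4, NA)

n_missing <- sum(is.na(a))   # number of NA entries: 1

filled <- a                  # work on a copy so that a is left unchanged
filled[is.na(filled)] <- 0   # assign 0 to every missing position
# filled is now 1 2 3 4 0
```

Whether to drop or replace missing values depends on the analysis; dropping changes the length of the vector, while replacing keeps positions aligned with other vectors.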

## 7.3. Indices With Logical Expressions

Any logical expression can be used as an index, which opens a wide range of possibilities. You can remove or focus on entries that match specific criteria. For example, you might want to keep only the entries that are below a certain value, removing everything at or above it:

```
> a = c(6,2,5,3,8,2)
> a
[1] 6 2 5 3 8 2
> b = a[a<6]
> b
[1] 2 5 3 2
```
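Comparisons can also be combined or inverted inside the brackets. The following sketch uses the element-wise `&` operator and the `!=` comparison on the same data; the thresholds and the names `middle` and `not_two` are chosen only for illustration:

```r
a <- c(6, 2, 5, 3, 8, 2)

middle  <- a[a > 2 & a < 8]  # keep entries strictly between 2 and 8: 6 5 3
not_two <- a[a != 2]         # drop every entry equal to 2: 6 5 3 8
```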

For another example, suppose you want to collect the values in one column of a data frame whose rows match either of two levels of a factor in another column:

```
> d = data.frame(one=as.factor(c('a','a','b','b','c','c')),
+                two=c(1,2,3,4,5,6))
> d
  one two
1   a   1
2   a   2
3   b   3
4   b   4
5   c   5
6   c   6
> both = d$two[(d$one=='a') | (d$one=='b')]
> both
[1] 1 2 3 4
```
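When more than two levels are involved, chaining comparisons becomes long. One alternative, not used in the original example, is R's built-in `%in%` operator, which tests each entry for membership in a set. The sketch below shows it next to the `|` form; the names `both_in` and `both_or` are only illustrative:

```r
d <- data.frame(one = as.factor(c('a', 'a', 'b', 'b', 'c', 'c')),
                two = c(1, 2, 3, 4, 5, 6))

# Membership test: TRUE for rows whose level is 'a' or 'b'
both_in <- d$two[d$one %in% c('a', 'b')]

# Equivalent selection using two comparisons joined by a single bar
both_or <- d$two[(d$one == 'a') | (d$one == 'b')]
```

Both vectors contain 1 2 3 4.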

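Note that a single ‘|’ was used to combine the two comparisons above. There is a difference between ‘||’ and ‘|’, and likewise between ‘&&’ and ‘&’. A single bar or ampersand performs a vector operation, term by term, so it is the form to use inside an index. The doubled forms take a single TRUE or FALSE on each side and evaluate to a single TRUE or FALSE result. Older versions of R silently used only the first element when a longer vector was given to ‘||’ or ‘&&’, but since R 4.3.0 that is an error, so the doubled forms are shown here with single values:

```
> (c(TRUE,TRUE))|(c(FALSE,TRUE))
[1] TRUE TRUE
> TRUE||FALSE
[1] TRUE
> (c(TRUE,TRUE))&(c(FALSE,TRUE))
[1] FALSE  TRUE
> TRUE&&FALSE
[1] FALSE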
```
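When a vector comparison has to feed a test that expects a single value, the built-in `any` and `all` functions collapse a logical vector to one TRUE or FALSE. This is a minimal sketch under that assumption; the data and the names `has_large`, `all_small`, and `status` are hypothetical:

```r
a <- c(6, 2, 5, 3, 8, 2)

has_large <- any(a > 7)   # TRUE, because 8 is greater than 7
all_small <- all(a < 8)   # FALSE, because 8 is not less than 8

# Each side of && is a single logical value here
if (length(a) > 0 && all(a > 0)) {
  status <- "non-empty and all positive"
} else {
  status <- "empty or has a non-positive entry"
}
```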