# Problem D

Perceptron Learning Algorithm

Implement the Perceptron Learning Algorithm (PLA) in to find a hypothesis in $d$ dimensions:

\[ h(x) = \textrm{sign}(w_0 + w_1 x_1 + w_2 x_2 + \ldots + w_ d x_ d). \]In other words, given a set of example pairs $(x, y)$, where $x \in \mathbb {R}^ d$ and $y \in \{ -1,+1\} $, use the PLA to identify a set of good weights $w_0$ through $w_ d$ for classifying the given examples.

## Input

Input for your program is a training set of $n$ examples in $d$ dimensions.

The first line contains two integers $1 \le n \le 10\, 000$ and $1 \le d \le 10$. Following are $n$ lines, each containing $d$ real numbers and one integer. The first $d$ numbers are for values $x_1$ through $x_ d$, and the last is for the corresponding $y$ value, which is either $-1$ or $1$. Each real value has at most $5$ digits past the decimal point and is in the range $[-2 \cdot 10^9, 2 \cdot 10^9]$.

## Output

Implement the Perceptron Learning Algorithm (or the Pocket algorithm) to create a perceptron model that fits the data as well as possible. Output $d+1$ real values corresponding to the coefficients $w_0$ to $w_ d$ in the definition of $h(x)$ above.

Note that the sample output shown is actually the test data on which a model would be evaluated (see below). It is in the same format as the input. A reasonable output for the given sample input might be:

-722.0 8.0 13.0

which would give a score of $80.0$ on the test data, according to the metric below.

## Scoring

Each time your program runs, it will process a different training set. The output of your program will be used to construct a perceptron to evaluate on another set of data, called the testing set. The testing set comes from a distribution similar to the training set, but is not known to your program.

Assume that the test data has $m$ examples $\{ (x(1),y(1)), \ldots , (x(m),y(m))\} $. The score your model receives for each test case is based on the accuracy on these $m$ examples (times $100$):

\[ \frac{100}{m} \cdot \sum _{i=1}^ m [h(x(i)) = y(i)] \]where $[\alpha ]$ is the indicator function (which is $1$ if the condition inside is true and $0$ otherwise).

The overall score for your submission is the average of the scores it receives for all hidden test cases. The sample data is excluded from this average.

Sample Input 1 | Sample Output 1 |
---|---|

6 2 77 66 1 86 14 1 50 21 -1 42 82 1 28 78 1 22 32 -1 |
10 2 23 47 -1 26 2 -1 96 78 1 1 65 -1 64 40 1 13 68 1 24 5 -1 69 32 1 21 19 -1 65 88 1 |