123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257 |
- [[indexing_documents]]
- === Indexing documents
- When you add documents to {es}, you index JSON documents. This maps naturally to
- PHP associative arrays, since they can easily be encoded in JSON. Therefore, in
- Elasticsearch-PHP you create and pass associative arrays to the client for
- indexing. There are several methods of ingesting data into {es} which we cover
- here.
- [discrete]
- ==== Single document indexing
- When indexing a document, you can either provide an ID or let {es} generate one
- for you.
- {zwsp} +
- .Providing an ID value
- [source,php]
- ----
- $params = [
- 'index' => 'my_index',
- 'id' => 'my_id',
- 'body' => [ 'testField' => 'abc']
- ];
- // Document will be indexed to my_index/_doc/my_id
- $response = $client->index($params);
- ----
- {zwsp} +
- .Omitting an ID value
- [source,php]
- ----
- $params = [
- 'index' => 'my_index',
- 'body' => [ 'testField' => 'abc']
- ];
- // Document will be indexed to my_index/_doc/<autogenerated ID>
- $response = $client->index($params);
- ----
- {zwsp} +
- If you need to set other parameters, such as a `routing` value, you specify
- those in the array alongside the `index`, and others. For example, let's set the
- routing and timestamp of this new document:
- .Additional parameters
- [source,php]
- ----
- $params = [
- 'index' => 'my_index',
- 'id' => 'my_id',
- 'routing' => 'company_xyz',
- 'timestamp' => strtotime("-1d"),
- 'body' => [ 'testField' => 'abc']
- ];
- $response = $client->index($params);
- ----
- {zwsp} +
- [discrete]
- ==== Bulk Indexing
- {es} also supports bulk indexing of documents. The bulk API expects JSON
- action/metadata pairs, separated by newlines. When constructing your documents
- in PHP, the process is similar. You first create an action array object (for
- example, an `index` object), then you create a document body object. This
- process repeats for all your documents.
- A simple example might look like this:
- .Bulk indexing with PHP arrays
- [source,php]
- ----
- for($i = 0; $i < 100; $i++) {
- $params['body'][] = [
- 'index' => [
- '_index' => 'my_index',
- ]
- ];
- $params['body'][] = [
- 'my_field' => 'my_value',
- 'second_field' => 'some more values'
- ];
- }
- $responses = $client->bulk($params);
- ----
- In practice, you'll likely have more documents than you want to send in a single
- bulk request. In that case, you need to batch up the requests and periodically
- send them:
- .Bulk indexing with batches
- [source,php]
- ----
- $params = ['body' => []];
- for ($i = 1; $i <= 1234567; $i++) {
- $params['body'][] = [
- 'index' => [
- '_index' => 'my_index',
- '_id' => $i
- ]
- ];
- $params['body'][] = [
- 'my_field' => 'my_value',
- 'second_field' => 'some more values'
- ];
- // Every 1000 documents stop and send the bulk request
- if ($i % 1000 == 0) {
- $responses = $client->bulk($params);
- // erase the old bulk request
- $params = ['body' => []];
- // unset the bulk response when you are done to save memory
- unset($responses);
- }
- }
- // Send the last batch if it exists
- if (!empty($params['body'])) {
- $responses = $client->bulk($params);
- }
- ----
- [[getting_documents]]
- === Getting documents
- {es} provides realtime GETs of documents. This means that as soon as the
- document is indexed and your client receives an acknowledgement, you can
- immediately retrieve the document from any shard. Get operations are performed
- by requesting a document by its full `index/type/id` path:
- [source,php]
- ----
- $params = [
- 'index' => 'my_index',
- 'id' => 'my_id'
- ];
- // Get doc at /my_index/_doc/my_id
- $response = $client->get($params);
- ----
- {zwsp} +
- [[updating_documents]]
- === Updating documents
- Updating a document allows you to either completely replace the contents of the
- existing document, or perform a partial update to just some fields (either
- changing an existing field or adding new fields).
- [discrete]
- ==== Partial document update
- If you want to partially update a document (for example, change an existing
- field or add a new one) you can do so by specifying the `doc` in the `body`
- parameter. This merges the fields in `doc` with the existing document.
- [source,php]
- ----
- $params = [
- 'index' => 'my_index',
- 'id' => 'my_id',
- 'body' => [
- 'doc' => [
- 'new_field' => 'abc'
- ]
- ]
- ];
- // Update doc at /my_index/_doc/my_id
- $response = $client->update($params);
- ----
- {zwsp} +
- [discrete]
- ==== Scripted document update
- Sometimes you need to perform a scripted update, such as incrementing a counter
- or appending a new value to an array. To perform a scripted update, you need to
- provide a script and usually a set of parameters:
- [source,php]
- ----
- $params = [
- 'index' => 'my_index',
- 'id' => 'my_id',
- 'body' => [
- 'script' => 'ctx._source.counter += count',
- 'params' => [
- 'count' => 4
- ]
- ]
- ];
- $response = $client->update($params);
- ----
- {zwsp} +
- [discrete]
- ==== Upserts
- Upserts are "Update or Insert" operations. This means an upsert attempts to run
- your update script, but if the document does not exist (or the field you are
- trying to update doesn't exist), default values are inserted instead.
- [source,php]
- ----
- $params = [
- 'index' => 'my_index',
- 'id' => 'my_id',
- 'body' => [
- 'script' => [
- 'source' => 'ctx._source.counter += params.count',
- 'params' => [
- 'count' => 4
- ],
- ],
- 'upsert' => [
- 'counter' => 1
- ],
- ]
- ];
- $response = $client->update($params);
- ----
- {zwsp} +
- [[deleting_documents]]
- === Deleting documents
- Finally, you can delete documents by specifying their full `/index/_doc_/id`
- path:
- [source,php]
- ----
- $params = [
- 'index' => 'my_index',
- 'id' => 'my_id'
- ];
- // Delete doc at /my_index/_doc_/my_id
- $response = $client->delete($params);
- ----
- {zwsp} +
|