java – SOLR 8.7: filtering by more than one condition in the [child] childDocumentTransformer

In Solr 8.7 I have the following configuration:

schema.xml

<schema name="default-config" version="1.6">
    <types>
        <fieldType name="text" class="solr.TextField">
            <analyzer>
                <tokenizer class="solr.ClassicTokenizerFactory"/>
                <filter class="solr.LowerCaseFilterFactory"/>
            </analyzer>
        </fieldType>
        <fieldType name="string" class="solr.StrField"/>
        <fieldType name="_nest_path_" class="solr.NestPathField"/>
    </types>

    <fields>
        <field name="_root_" type="string" indexed="true" stored="false"/>
        <field name="_nest_path_" type="_nest_path_" indexed="true" stored="false"/>
        <field name="id" type="string" indexed="true" stored="true"/>

        <dynamicField name="*" type="text" indexed="true" stored="true"/>
    </fields>

    <uniqueKey>id</uniqueKey>
</schema>

Documents to index:

{
  id: 1,
  object: "doc",
  items: [
    {
      id: 2,
      content: "lorem ipsum same"
    }
  ]
},
{
  id: 3,
  object: "doc",
  items: [
    {
      id: 4,
      content: "hello word same"
    },
    {
      id: 5,
      content: "lorem ipsum same"
    }
  ]
}

Query:

q={!parent which="object:doc AND id:*"} +(content:same AND id:4)
fl=*, [child childFilter="smc_content:same AND smc_id:4"]

Current result:

{
  "responseHeader":{
    "status":0,
    "QTime":3,
    "params":{
      "q":"{!parent which=\"object:doc AND id:*\"} +(content:same AND id:4)",
      "fl":"*, [child childFilter=\"content:same AND id:4\"]",
      "_":"1612202589981"}},
  "response":{"numFound":1,"start":0,"numFoundExact":true,"docs":[
      {
        "id":"3",
        "object":"doc",
        "items":[
          {
            "id":"4",
            "content":"hello word same"},
          {
            "id":"5",
            "content":"lorem ipsum same"}]}]
  }}

The expected result is the following:

{
  "id": "3",
  "object": "doc",
  "items": [
    {
      "id": "4",
      "content": "hello word same"
    }
  ]
}

I have tried different configurations found in the SOLR 8.7 documentation and in various forums, which suggest that to use multiple conditions in the [child] childDocumentTransformer they need to be defined as a variable and wrapped in parentheses, but I have not been able to get it working. I attached an example of what I tried as an image.
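
The "define it as a variable, in parentheses" suggestion mentioned above can be sketched as follows. This is only an illustration under assumptions: the parameter name `childq` and the helper `build_child_query` are hypothetical, and whether this produces the expected result on 8.7 with a `_nest_path_` schema has not been verified here. It relies on Solr's parameter dereferencing (`$name`) inside local params, so the multi-clause filter does not have to be quoted inside the transformer.

```python
from urllib.parse import urlencode


def build_child_query(parent_filter, child_clauses):
    """Build request params that pass the child filter by reference.

    `childq` is a made-up parameter name; any unused name would do.
    """
    # "+a:b +c:d" is the Lucene-syntax way of requiring every clause (AND).
    child_query = " ".join(f"+{clause}" for clause in child_clauses)
    return {
        # Block-join parent query: parents that have a matching child.
        "q": f'{{!parent which="{parent_filter}"}}{child_query}',
        # $childq is resolved from the request parameter of the same name.
        "fl": "*,[child childFilter=$childq]",
        "childq": child_query,
    }


params = build_child_query("object:doc AND id:*", ["content:same", "id:4"])
# URL-encoded query string that could be appended to .../select?
print(urlencode(params))
```

Keeping the filter in its own request parameter also means the same string is used for both the parent query and the transformer, so the two cannot drift apart.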

views – Drupal 8 purge module to allow SOLR index replication before cache invalidation happens

I have a Drupal 8 website running on Acquia infrastructure. I recently configured the Purge module to invalidate the cache on Acquia Varnish and Cloudflare when cache tags are invalidated.

Problem: the Solr index takes around 60 to 120 seconds to replicate fully, but the Purge module processes its queue immediately to clear cache tags on Varnish and Cloudflare. As a result, if a page is browsed immediately after an update, the new cache entry is generated from stale content in the Solr index.

Is there any way to add a delay (e.g. 2000 ms) before the Purge queue is processed?

Any possible solution would be appreciated.
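
To make the requested behaviour concrete, here is a minimal, language-neutral sketch of a queue that only releases items after they have waited a fixed delay. It is not the Purge module's API: the class `DelayedPurgeQueue`, its methods, and the tag names are all hypothetical and only illustrate the idea of holding invalidations until replication has had time to finish.

```python
import time
from collections import deque


class DelayedPurgeQueue:
    """Hold cache tags until they are at least `delay_seconds` old."""

    def __init__(self, delay_seconds, clock=time.monotonic):
        self.delay_seconds = delay_seconds
        self._clock = clock          # injectable so the delay can be tested
        self._items = deque()        # (enqueued_at, cache_tag), oldest first

    def add(self, cache_tag):
        self._items.append((self._clock(), cache_tag))

    def claim_ready(self):
        """Return, in order, the tags that have waited long enough."""
        now = self._clock()
        ready = []
        while self._items and now - self._items[0][0] >= self.delay_seconds:
            ready.append(self._items.popleft()[1])
        return ready


# Example: wait for the upper bound of the replication window (120 s).
queue = DelayedPurgeQueue(delay_seconds=120)
queue.add("node:42")
print(queue.claim_ready())  # nothing is released right after enqueueing
```

A queue processor that runs periodically would call `claim_ready()` on each run and invalidate only the tags it returns.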

indexing – SOLR Admin Console is not responding normally

I recently upgraded SOLR from version 8.4.1 to 8.6.3. After creating the core and indexing it (through Hybris Backoffice), I noticed peculiar behavior in the SOLR Admin Console. When I select the core and try to open the “Query” section, I am automatically redirected to the Dashboard, so I cannot use the Admin Console to inspect or query the indexed data. The same happens with every other section that becomes available after selecting a core. I never faced this issue in the earlier version.

I need to upgrade to this particular version of SOLR because, after SOLR 7.7.x, Hybris supports only this version.

Has anyone faced this issue? Please help if you know the solution.

jetty – Solr Service is Running But Connection refused by Solr 8.5.2 on Ubuntu 20.04

Note: I have already checked similar questions and tried their solutions, but could not resolve this issue.
I have installed Solr 8.5.2 on Ubuntu 20.04 and created some blank cores for testing.
I was able to access the Admin UI at http://139.59.75.45:8983/solr/
But after I tried to upload a document from my desktop (solr\example\films\films.csv), it stopped responding.
I checked from the terminal: the solr service is active, but when I tried

curl -X GET "http://139.59.75.45:8983/solr/"

I got the following response:
curl: (7) Failed to connect to 139.59.75.45 port 8983: Connection refused.

Note: I did not make any changes to the core schema before uploading. Even after searching all the logs and Stack Overflow, I could not find the reason. Please advise how to resolve this issue.
The following is the solr service status output:

root@lemp-02:~# service solr status
● solr.service - LSB: Controls Apache Solr as a Service
     Loaded: loaded (/etc/init.d/solr; generated)
     Active: active (exited) since Sat 2020-12-26 18:22:53 UTC; 9h ago
       Docs: man:systemd-sysv-generator(8)
    Process: 4148 ExecStart=/etc/init.d/solr start (code=exited, status=0/SUCCESS)

Dec 26 18:22:45 lemp-02 systemd[1]: Starting LSB: Controls Apache Solr as a Service...
Dec 26 18:22:45 lemp-02 su[4160]: (to solr) root on none
Dec 26 18:22:45 lemp-02 su[4160]: pam_unix(su-l:session): session opened for user solr by (uid=0)
Dec 26 18:22:53 lemp-02 solr[4225]: Started Solr server on port 8983 (pid=4220). Happy searching!
Dec 26 18:22:53 lemp-02 systemd[1]: Started LSB: Controls Apache Solr as a Service.

Following is solr.log:

2020-12-26 18:22:47.606 INFO  (main) [   ] o.e.j.u.log Logging initialized @2098ms to org.eclipse.jetty.util.log.Slf4jLog
2020-12-26 18:22:48.045 INFO  (main) [   ] o.e.j.s.Server jetty-9.4.24.v20191120; built: 2019-11-20T21:37:49.771Z; git: 363d5f2df3a8a28de40604320230664b9c793c16; jvm 11.0.9.1+1-Ubuntu-0ubuntu1.20.04
2020-12-26 18:22:48.080 INFO  (main) [   ] o.e.j.d.p.ScanningAppProvider Deployment monitor [file:///opt/solr-8.5.2/server/contexts/] at interval 0
2020-12-26 18:22:48.758 INFO  (main) [   ] o.e.j.w.StandardDescriptorProcessor NO JSP Support for /solr, did not find org.apache.jasper.servlet.JspServlet
2020-12-26 18:22:48.783 INFO  (main) [   ] o.e.j.s.session DefaultSessionIdManager workerName=node0
2020-12-26 18:22:48.783 INFO  (main) [   ] o.e.j.s.session No SessionScavenger set, using defaults
2020-12-26 18:22:48.791 INFO  (main) [   ] o.e.j.s.session node0 Scavenging every 600000ms
2020-12-26 18:22:48.990 INFO  (main) [   ] o.a.s.s.SolrDispatchFilter Using logger factory org.apache.logging.slf4j.Log4jLoggerFactory
2020-12-26 18:22:48.999 INFO  (main) [   ] o.a.s.s.SolrDispatchFilter  ___      _       Welcome to Apache Solr™ version 8.5.2
2020-12-26 18:22:49.004 INFO  (main) [   ] o.a.s.s.SolrDispatchFilter / __| ___| |_ _   Starting in standalone mode on port 8983
2020-12-26 18:22:49.004 INFO  (main) [   ] o.a.s.s.SolrDispatchFilter \__ \/ _ \ | '_|  Install dir: /opt/solr
2020-12-26 18:22:49.005 INFO  (main) [   ] o.a.s.s.SolrDispatchFilter |___/\___/_|_|    Start time: 2020-12-26T18:22:49.005242Z
2020-12-26 18:22:49.118 INFO  (main) [   ] o.a.s.c.SolrResourceLoader Using system property solr.solr.home: /var/solr/data
2020-12-26 18:22:49.128 INFO  (main) [   ] o.a.s.c.SolrXmlConfig Loading container configuration from /var/solr/data/solr.xml
2020-12-26 18:22:49.281 INFO  (main) [   ] o.a.s.c.SolrXmlConfig MBean server found: com.sun.jmx.mbeanserver.JmxMBeanServer@33065d67, but no JMX reporters were configured - adding default JMX reporter.
2020-12-26 18:22:50.611 INFO  (main) [   ] o.a.s.h.c.HttpShardHandlerFactory Host whitelist initialized: WhitelistHostChecker [whitelistHosts=null, whitelistHostCheckingEnabled=true]
2020-12-26 18:22:50.981 WARN  (main) [   ] o.e.j.u.s.S.config Trusting all certificates configured for Client@1cb7936c[provider=null,keyStore=null,trustStore=null]
2020-12-26 18:22:50.982 WARN  (main) [   ] o.e.j.u.s.S.config No Client EndPointIdentificationAlgorithm configured for Client@1cb7936c[provider=null,keyStore=null,trustStore=null]
2020-12-26 18:22:51.458 WARN  (main) [   ] o.e.j.u.s.S.config Trusting all certificates configured for Client@35cd68d4[provider=null,keyStore=null,trustStore=null]
2020-12-26 18:22:51.458 WARN  (main) [   ] o.e.j.u.s.S.config No Client EndPointIdentificationAlgorithm configured for Client@35cd68d4[provider=null,keyStore=null,trustStore=null]
2020-12-26 18:22:51.504 WARN  (main) [   ] o.a.s.c.CoreContainer Not all security plugins configured!  authentication=disabled authorization=disabled.  Solr is only as secure as you make it. Consider configuring authentication/authorization before exposing Solr to users internal or external.  See https://s.apache.org/solrsecurity for more info
2020-12-26 18:22:51.890 INFO  (main) [   ] o.a.s.c.TransientSolrCoreCacheDefault Allocating transient cache for 2147483647 transient cores
2020-12-26 18:22:51.893 INFO  (main) [   ] o.a.s.h.a.MetricsHistoryHandler No .system collection, keeping metrics history in memory.
2020-12-26 18:22:52.080 INFO  (main) [   ] o.a.s.m.r.SolrJmxReporter JMX monitoring for 'solr.node' (registry 'solr.node') enabled at server: com.sun.jmx.mbeanserver.JmxMBeanServer@33065d67
2020-12-26 18:22:52.086 INFO  (main) [   ] o.a.s.m.r.SolrJmxReporter JMX monitoring for 'solr.jvm' (registry 'solr.jvm') enabled at server: com.sun.jmx.mbeanserver.JmxMBeanServer@33065d67
2020-12-26 18:22:52.101 INFO  (main) [   ] o.a.s.m.r.SolrJmxReporter JMX monitoring for 'solr.jetty' (registry 'solr.jetty') enabled at server: com.sun.jmx.mbeanserver.JmxMBeanServer@33065d67
2020-12-26 18:22:52.175 INFO  (main) [   ] o.a.s.c.CorePropertiesLocator Found 2 core definitions underneath /var/solr/data
2020-12-26 18:22:52.176 INFO  (main) [   ] o.a.s.c.CorePropertiesLocator Cores are: [mycollection, test_core]
2020-12-26 18:22:52.386 INFO  (coreLoadExecutor-9-thread-2) [   x:test_core] o.a.s.c.SolrConfig Using Lucene MatchVersion: 8.5.2
2020-12-26 18:22:52.389 INFO  (coreLoadExecutor-9-thread-1) [   x:mycollection] o.a.s.c.SolrConfig Using Lucene MatchVersion: 8.5.2
2020-12-26 18:22:52.539 INFO  (main) [   ] o.e.j.s.h.ContextHandler Started o.e.j.w.WebAppContext@7894f09b{/solr,file:///opt/solr-8.5.2/server/solr-webapp/webapp/,AVAILABLE}{/opt/solr-8.5.2/server/solr-webapp/webapp}
2020-12-26 18:22:52.595 INFO  (main) [   ] o.e.j.s.AbstractConnector Started ServerConnector@66fdec9{HTTP/1.1,[http/1.1, h2c]}{0.0.0.0:8983}
2020-12-26 18:22:52.595 INFO  (main) [   ] o.e.j.s.Server Started @7094ms
2020-12-26 18:22:52.985 INFO  (coreLoadExecutor-9-thread-1) [   x:mycollection] o.a.s.s.IndexSchema [mycollection] Schema name=default-config
2020-12-26 18:22:52.986 INFO  (coreLoadExecutor-9-thread-2) [   x:test_core] o.a.s.s.IndexSchema [test_core] Schema name=default-config
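
A note on reading the status output above: for an LSB init script, "active (exited)" only records that the start script finished successfully, so it does not by itself show that the Solr JVM is still running, and "Connection refused" means nothing accepted the TCP connection on that port. A small probe like the one below can confirm this from the server itself. It is a sketch only; the function name `port_accepts_connections` is made up for illustration, and the port comes from the post.

```python
import socket


def port_accepts_connections(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection raises OSError (e.g. ConnectionRefusedError,
        # timeout) when nothing is listening or the host is unreachable.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `port_accepts_connections("127.0.0.1", 8983)` on the server and then the same check against the public address from another machine helps separate "Solr is no longer listening" from "something between the client and the server is blocking the port".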

8 – Apache SOLR index or retrieve the node a file is attached to

I’m building a search engine with Apache SOLR (8.7.0) where PDF files are indexed and searched through, that part is working (with search_api_attachments in case you’re wondering). However, the client wants to show a link to the node where this file was referenced from. This is using plain file fields, no media library. Is there a way to either index the node ID with the file, or get this reference somehow in the view with a relationship or something? Spent a good day on it but couldn’t figure it out, maybe I’m just overlooking it. Or is this something that would require me writing a custom hook to add the field to the index? (it seems like quite a general use case to me though)

search – What is the difference between index all on a node and indexing that takes place after node save for a rendered html node on a solr index?

If a page containing panels is indexed as a rendered item, the content of the panels is not included in the index when using “index all”. However, if you save the node, the content of the panels is added to the index. What is the difference between the “index all” process for a node and the indexing that occurs after a node save?

7 – Remove duplicated results – Apache Solr

English is not my native language so please be patient.

Good afternoon. My Drupal 7 site currently holds several articles, help topics, and document guides that are versioned. For example, searching for the word “chocolate” returns:

Product Cake:

  • title: Old aunt’s Chocolate Cake. Version: 1.0
  • title: Old aunt’s Chocolate Cake. Version: 1.2
  • title: Colombian Cake, Coffee, and Chocolate. Version: 1.0

Product Candy:

  • title: Chocolate candy for fitness. version: 1.0
  • title: Gretel’s Chocolate recipe. version: 2.2
  • title: Gretel’s Chocolate recipe. version: 3.0
  • title: Gretel’s Chocolate recipe. version: 1.0

Product Beverage:

  • title: Milk & Chocolate. version: 1.0
  • title: Milkshake for the boys in the yard: Vanilla, Chocolate, and More. version: 1.0
  • title: Milkshake for the boys in the yard: Vanilla, Chocolate, and More. version: 2.1
  • title: Hot Chocolate for Cold Winters. version: 1.0

Currently, the site uses Apache Solr Search and related modules. I haven’t been able to write a hook that removes the duplicate results for items with the same title but different versions; ideally, I would like results for the latest version only.

The Apache Solr Sort module was supposed to do the trick. Sadly, I still get the same results, sorted by most recent creation date, but because those articles have (almost) the same score, none of the duplicated versions are removed.

To sort by most recent creation date, I have modified the score using the creation date.

It is also possible to achieve the sort using the Apache Solr search module:
admin/config/search/apachesolr/settings/solr/bias?destination=admin/config/search/apachesolr/settings

But I still haven’t found a way to remove the duplicate results. The documentation says that grouping would solve my need, but Apache Solr returns no results when I use the following hook_apachesolr_query_prepare implementation, which is based on the Sort module. Is this the correct way to implement grouping? Is there another option to remove the duplicate versions?

/**
 * Implements hook_apachesolr_query_prepare().
 */
function custom_search_apachesolr_query_prepare($query) {

  $env_id = $query->solr('getId');
  $process_callback = apachesolr_environment_variable_get($env_id, 'process_response_callback', 'apachesolr_search_process_response');
  $group_field = apachesolr_environment_variable_get($env_id, 'group_field', 'title');
  $group_limit = apachesolr_environment_variable_get($env_id, 'group_limit', '10');

  if ($process_callback == "apachesolr_sort_process_response") {
    $query->addParams(
      array(
        'group' => 'true',
        'group.field' => $group_field,
        'group.limit' => $group_limit,
        'group.ngroups' => 'true',
        'group.sort' => 'ds_created desc',
        'group.facet' => 'true',
      )
    );
    // Paging differs with grouping: drop 'start' and advance group.offset
    // by the group.limit amount for each page.
    if (isset($query->page)) {
      $query->removeParam('start');
      $query->addParam('group.offset', $query->page * $group_limit);
    }
  }
}
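
For comparison, the desired outcome (one result per title, latest version only) can be sketched as a post-processing step outside Solr. This is an illustration under assumptions, not the module's behaviour: the field names `title` and `version` and both helper functions are hypothetical, and versions are assumed to be dotted numbers such as "2.2". Within Solr itself, a `group.limit` of 1 combined with a `group.sort` is the setting that returns a single document per group, whereas the value of 10 above allows up to ten per title.

```python
def version_key(version):
    """Turn '2.10' into (2, 10) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def latest_per_title(results):
    """Keep only the highest-versioned document for each title."""
    latest = {}
    for doc in results:
        current = latest.get(doc["title"])
        if current is None or version_key(doc["version"]) > version_key(current["version"]):
            latest[doc["title"]] = doc
    # Dicts preserve insertion order, so titles stay in first-seen order.
    return list(latest.values())


results = [
    {"title": "Gretel’s Chocolate recipe", "version": "2.2"},
    {"title": "Gretel’s Chocolate recipe", "version": "3.0"},
    {"title": "Gretel’s Chocolate recipe", "version": "1.0"},
    {"title": "Chocolate candy for fitness", "version": "1.0"},
]
for doc in latest_per_title(results):
    print(doc["title"], doc["version"])
```

Post-processing like this only sees the page of results it is given, so it does not fix result counts or paging; grouping on the Solr side avoids that limitation.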

Thanks in advance for your time.

8 – How to upgrade Apache solr

My predecessor set up Apache Solr 4.x with Tomcat years ago for a D7 site. Now, I want to upgrade Apache Solr to the latest version and I read that Apache Solr 8 doesn’t support Tomcat anymore.

Can someone point me in the right direction on how to upgrade it? Do I have to tear down my entire Apache Solr setup with Tomcat and then install Apache Solr 8?

Thank you!